IN THE UNITED STATES PATENT AND TRADEMARK OFFICE

Application of Randall et al. 
Serial No. Not yet assigned 
Filed: 

For: USE OF DNA ENCODING PLASTID PYRUVATE DEHYDROGENASE AND 
BRANCHED CHAIN OXOACID DEHYDROGENASE COMPONENTS TO 
ENHANCE POLYHYDROXYALKANOATE BIOSYNTHESIS IN PLANTS 

Examiner: Unknown 

PRELIMINARY AMENDMENT A 

Divisional of Application Serial No. 09/108,020 
Honorable Commissioner of Patents and Trademarks 
Sir: 

Please enter the following amendments: 

IN THE SPECIFICATION 

On page 1 at line 4, after "application" insert --is a divisional application of U.S.
application Serial No. 09/108,020, filed June 30, 1998, herein incorporated by reference in its
entirety,--

On page 1, at line 9, after "March 2, 1998," insert --herein incorporated by reference in
their entirety--

At page 20, after line 19 and before "Detailed Description of the Invention", please insert 
the following: 

Figure 8 shows the alignment of the deduced amino acid sequences of PDC E1α from
plastid A.t. (SEQ ID NO: 33), P. purpurea (SEQ ID NO: 34), A. thaliana (SEQ ID NO: 35), H.
sapiens II (SEQ ID NO: 36), S. cerevisiae (SEQ ID NO: 37), A. suum I (SEQ ID NO: 38), M.
capricolum (SEQ ID NO: 39), B. subtilis (SEQ ID NO: 40) and a consensus sequence (SEQ ID
NO: 41). Abbreviations are the same as in Figure 6. "*" indicates conserved, "." non-conserved
phosphorylation sites, "o" indicates the conserved Cys 62 of the mature H.s. E1α sequence.

Figure 9 shows the alignment of the deduced amino acid sequences of PDC E1β from
plastid A.t. (SEQ ID NO: 42), P. purpurea (SEQ ID NO: 43), A. thaliana (SEQ ID NO: 44), H.
sapiens (SEQ ID NO: 45), S. cerevisiae (SEQ ID NO: 46), A. suum (SEQ ID NO: 47), M.
capricolum (SEQ ID NO: 48), B. subtilis (SEQ ID NO: 49) and a consensus sequence (SEQ ID
NO: 50). Abbreviations are the same as in Figure 6.

Figure 10 shows the alignment of the deduced amino acid sequences of various 
BCOADC E1β subunits, A.t. (SEQ ID NO: 51), Human (SEQ ID NO: 52), Bovine (SEQ ID NO:
53) and consensus (SEQ ID NO: 54). Abbreviations are the same as in Figure 6. "." indicates 
conserved amino acids; "-" indicates a gap inserted to maximize homology. 

On page 53, at line 26, delete "Tables 2 and 3" and replace with --Figs. 8 and 9--

On page 54, at line 12, delete "Table 2" and replace with --Fig. 8--

On page 54, at line 14, delete "Table 2" and replace with --Fig. 8--

On page 54, at lines 20-21, delete "Table 2" and replace with --Fig. 8--

On page 55, at line 4, delete "Table 2" and replace with --Fig. 8--

On page 55, at line 13, delete "Table 2" and replace with --Fig. 8--

On page 55, at line 21, delete "Table 2" and replace with --Fig. 8--

On page 55, at line 31, delete "Table 3" and replace with --Fig. 9--

On page 56, at line 4, delete "Table 3" and replace with --Fig. 9--

On page 56, at line 8, delete "Table 3" and replace with --Fig. 9--

On page 57, at line 11, delete "(Tables 2 and 3)" and replace with --(Figs. 8 and 9)--

On page 68, at line 29, delete "Table 4" and replace with --Fig. 10--

On page 69, at line 2, delete "Table 4" and replace with --Fig. 10--

IN THE CLAIMS

Please cancel claims 1-3, 5-7, 9-11, 13-15, 17-19, 25-27. 



Please amend claim 21 as follows. 



21. (Amended) [The] An isolated DNA molecule [of claim 17], comprising a nucleotide
sequence selected from the group consisting of:

(a) the nucleotide sequence shown in SEQ ID NO: 13, or the complement thereof;

(b) a nucleotide sequence that hybridizes to said nucleotide sequence of (a) under
a wash stringency equivalent to 0.5X SSC to 2X SSC, 0.1% SDS, at 55-65°C, and which
encodes a polypeptide having enzymatic activity similar to that of Arabidopsis thaliana
branched chain 2-oxoacid dehydrogenase complex E1β subunit;

(c) a nucleotide sequence encoding the same genetic information as said
nucleotide sequence of (a), but which is degenerate in accordance with the degeneracy of
the genetic code; and

(d) a nucleotide sequence encoding the same genetic information as said
nucleotide sequence of (b), but which is degenerate in accordance with the degeneracy of
the genetic code;

wherein the naturally occurring branched chain oxoacid dehydrogenase complex E2
component binding region thereof is replaced with the E2 component binding region of a
plastid pyruvate dehydrogenase complex E1β subunit.

Please add the following new claim. 

44. An isolated polypeptide encoded by the DNA molecule of claim 21.

REMARKS

The specification has been amended to properly identify the application as a divisional 
application. The specification has also been amended to correct certain informalities. No new 
matter has been added. 

Claim 21 has been amended to remove its dependency on cancelled claim 17. Support
for new claim 44 can be found throughout the specification and in particular in the sections
beginning on pages 35 and 41.

Respectfully submitted, 



James E. Butler, Ph.D.
Registration No. 40,931 

SENNIGER, POWERS, LEAVITT & ROEDEL 
One Metropolitan Square, 16th Floor 
St. Louis, MO 63102 
(314) 231-5400 



CERTIFICATE OF MAILING 

I certify that the foregoing Preliminary Amendment A is being deposited with the United 
States Postal Service as Express Mail #EL615274325US, in an envelope addressed to: Box 
PATENT APPLICATION, Assistant Commissioner for Patents, Washington, D.C. 20231 on 
this 10th day of October, 2000. 



Mary KayTferr v ) 



JEB/mkd

Use of DNA Encoding Plastid Pyruvate Dehydrogenase
and Branched Chain Oxoacid Dehydrogenase Components
to Enhance Polyhydroxyalkanoate Biosynthesis in Plants

This application claims the benefit of priority of 
the following Provisional patent applications: Serial
Number 60/051,291, filed June 30, 1997; Serial Number 
60/055,255, filed August 1, 1997; Serial Number 
60/076,544, filed March 2, 1998; and Serial Number 
60/076,554, filed March 2, 1998. 

Background of the Invention

Field of the Invention 

The present invention relates to genetically
engineered plants. More particularly, the present
invention relates to the optimization of substrate pools
to facilitate the biosynthetic production of commercially
useful polyhydroxyalkanoates (PHAs) in plants.

The present invention especially relates to the
production of copolyesters of 3-hydroxybutyrate (3HB) and
3-hydroxyvalerate (3HV), designated P(3HB-co-3HV)
copolymer, and derivatives thereof.



Description of Related Art 
Polyhydroxyalkanoates 

Polyhydroxyalkanoates are polyesters that accumulate
in a wide variety of bacteria. These polymers have
properties ranging from stiff and brittle plastics to
rubber-like materials, and are biodegradable. Due to
these properties, PHAs are an attractive source of
non-polluting plastics and elastomers.

Currently, there are approximately a dozen
biodegradable plastics in commercial use that possess
properties suitable for producing a number of specialty
and commodity products (Lindsay, 1992). One such
biodegradable plastic in the polyhydroxyalkanoate (PHA)
family that is commercially important is Biopol™, a
random copolymer of 3-hydroxybutyrate (3HB) and
3-hydroxyvalerate (3HV). This bioplastic is used to
produce biodegradable molded material (e.g., bottles),
films, and coatings, and in drug release applications.
Biopol™ is produced via a fermentation process employing
the bacterium Alcaligenes eutrophus (Byrom, 1987). The
current market price is $6-7/lb, and the annual
production is 1,000 tons. By best estimates, this price
is likely to be reduced only about 2-fold via
fermentation (Poirier et al., 1995). Competitive
synthetic plastics such as polypropylene and polyethylene
cost about 35-45¢/lb (Layman, 1994). The annual global
demand for polyethylene alone is about 37 million metric
tons (Poirier et al., 1995). It is therefore likely that
the cost of producing P(3HB-co-3HV) by microbial
fermentation will restrict its use to low-volume
specialty applications.
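
The price comparison above can be made explicit with a short calculation. The following is only an illustrative sketch in Python using the figures quoted in the preceding paragraph; the function and variable names are hypothetical and are not part of the disclosure. Under these figures, a 2-fold reduction would still leave the fermentation product at roughly 6.7 to 10 times the price of the commodity plastics.

```python
# Prices quoted in the passage above, in US dollars per pound (low, high).
BIOPOL_PRICE = (6.00, 7.00)
COMMODITY_PRICE = (0.35, 0.45)
# Best-estimate cost reduction achievable by fermentation, as quoted above.
FERMENTATION_FOLD_REDUCTION = 2.0


def reduce_price(price_range, fold):
    """Divide both ends of a (low, high) price range by a fold reduction."""
    low, high = price_range
    return (low / fold, high / fold)


def ratio_range(numerator, denominator):
    """Return the (smallest, largest) ratio between two (low, high) ranges."""
    return (numerator[0] / denominator[1], numerator[1] / denominator[0])


projected = reduce_price(BIOPOL_PRICE, FERMENTATION_FOLD_REDUCTION)
premium = ratio_range(projected, COMMODITY_PRICE)
print(f"Projected fermentation price: ${projected[0]:.2f}-{projected[1]:.2f}/lb")
print(f"Premium over commodity plastics: {premium[0]:.1f}x to {premium[1]:.1f}x")
```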

Nakamura et al. (1992) reported using threonine
(20 g/L) as the sole carbon source for the production of
P(3HB-co-3HV) copolymer in A. eutrophus. These workers
initially suggested that the copolymer might form via the
degradation of threonine by threonine deaminase, with
conversion of the resultant α-ketobutyrate (= 2-oxobutyrate)
to propionyl-CoA. However, they ultimately
concluded that threonine was utilized directly, without
breaking carbon-carbon bonds, to form valeryl-CoA as the
3HV precursor. The nature of this chemical conversion
was not described, but since the breaking of
carbon-carbon bonds was not postulated to occur, the pathway
could not involve threonine deaminase in conjunction with
an α-ketoacid decarboxylating step to form propionate or
propionyl-CoA. In the experiments of Nakamura et al.,
the PHA polymer content was very low (< 6% of dry cell
weight). This result, in conjunction with the expense of
feeding bacteria threonine, makes their approach
impractical for the commercial production of P(3HB-co-3HV)
copolymer.

Yoon et al. (1995) have shown that growth of
Alcaligenes sp. SH-69 on a medium supplemented with
threonine, isoleucine, or valine resulted in significant
increases in the 3HV fraction of the P(3HB-co-3HV)
copolymer. In addition to these amino acids, glucose (3%
wt/vol) was also added to the growth media. In contrast
to the results obtained by Nakamura et al. (1992), growth
of A. eutrophus under the conditions described by Yoon et
al. (1995) did not result in the production of
P(3HB-co-3HV) copolymer when the medium was supplemented
with threonine as the sole carbon source. From their
results, Yoon et al. (1995) implied that the synthetic
pathway for the 3HV component in P(3HB-co-3HV) copolymer
is likely the same as that described in WO 91/18995 and
Steinbüchel and Pieper (1992). This postulated synthetic
pathway involves the degradation of isoleucine to
propionyl-CoA (Figure 3).

The PHB Biosynthetic Pathway

Polyhydroxybutyrate (PHB) was first discovered in
1926 as a constituent of the bacterium Bacillus
megaterium (Lemoigne, 1926). Since then, PHAs such as PHB
have been found in more than 90 different genera of
gram-negative and gram-positive bacteria (Steinbüchel,
1991). These microorganisms produce PHAs using
R-β-hydroxyacyl-CoAs as the direct metabolic substrate
for a PHA synthase, and produce polymers of
R-(3)-hydroxyalkanoates having chain lengths ranging from
C3-C14 (Steinbüchel and Valentin, 1995).

To date, the best understood biochemical pathway for
PHB production is that found in the bacterium Alcaligenes
eutrophus (Dawes and Senior, 1973; Slater et al., 1988;
Schubert et al., 1988; Peoples and Sinskey, 1989a and
1989b). This pathway, which is also utilized by other
microorganisms, is summarized in Figure 1. In this
organism, three gene products encoded by an operon, i.e.,
PHB synthase, 3-ketothiolase, and acetoacetyl-CoA
reductase, encoded by the phbC, phbA, and phbB genes,
respectively, are required to produce the PHA homopolymer
R-polyhydroxybutyrate (PHB).

As further shown in Figure 1, acetyl-CoA is the
starting substrate employed in the biosynthetic pathway.
This metabolite is naturally available for PHB production
in the cytoplasm and plastids of plants.

Poirier et al. (1992) demonstrated that a
multi-enzyme pathway can be introduced into plants to produce
polyhydroxybutyrate (PHB). In that work, the genes
encoding the Alcaligenes eutrophus acetoacetyl-CoA
reductase (phbB) and PHB synthase (phbC) were
introduced into Arabidopsis thaliana, where the enzymes
were expressed cytoplasmically. A 3-ketothiolase is
already expressed in the cytoplasm of Arabidopsis.
Although PHB was produced in the plants which expressed
the three enzymes, the yield was low and the plants were
stunted and had reduced seed production.

Nawrath et al. (1994) provided a solution to these
problems. There, the genes for the three bacterial PHB
enzymes (phbC, phbA, and phbB) were modified to comprise
a pea chloroplast targeting peptide (= "transit peptide"),
which targeted the enzymes to the chloroplast.
Arabidopsis plants which produced these three enzymes in
the chloroplast accumulated large amounts of PHB. There
was also no apparent effect of these transgenes, or of
the PHB accumulation, on the growth and development of
the transgenic plants.

The P(3HB-co-3HV) Copolymer Biosynthetic Pathway

As noted above, P(3HB-co-3HV) random copolymer,
commercially known as Biopol™, is produced by
fermentation employing A. eutrophus. A proposed
biosynthetic pathway for P(3HB-co-3HV) copolymer
production is shown in Figure 2. Production of this
polymer in plants has been reported (oral presentation by
Mitsky et al., 1997).

Since the production of PHB in chloroplasts
apparently does not affect plant growth and development
as does production of PHB in the cytoplasm (Nawrath et
al., 1992), the chloroplast is the preferred site of
P(3HB-co-3HV) biosynthesis. The successful production of
P(3HB-co-3HV) copolymer in plants thus requires the
presence of three PHA biosynthetic enzymes as well as the
substrates required for the copolymer biosynthesis
(Figure 2), preferably in the plastids. For the 3HB
component of the polymer, the substrate naturally exists
in chloroplasts in sufficient concentration in the form
of acetyl-CoA (Nawrath et al., 1994). However, this is
not true for the 3HV component of the polymer, where the
starting substrate is propionyl-CoA. Figure 3 is an
overview of enzyme pathways which are related to the
provision of these substrates. The engineering of plants
to generate sufficient chloroplast pools of
propionyl-CoA, along with the proper PHA biosynthetic enzymes
(i.e., a β-ketothiolase, a β-ketoacyl-CoA reductase, and
a PHA synthase), makes it possible to produce
copolyesters of poly(3HB-co-3HV) in these organisms.

Methods for optimization of PHB and P(3HB-co-3HV)
production in various crop plants are disclosed in Gruys
et al. (1998). A major focus in that invention is the
optimization of the substrate pools for P(3HB-co-3HV), in
order to provide 2-ketobutyrate and propionyl-CoA to the
site of copolymer synthesis. Gruys et al. (1998) also
discusses exploring the potential use of a pyruvate
dehydrogenase complex and a branched chain oxoacid
dehydrogenase complex to convert 2-oxobutyrate to
propionyl-CoA.

Gruys et al. (1998) also provides methods for the
optimization of β-ketothiolase, β-ketoacyl-CoA reductase,
and PHA synthase activities in plants and bacteria. It
was determined therein that the A. eutrophus
β-ketothiolase PhbA was metabolically blocked from
producing P(3HB-co-3HV) due to its inability to utilize
propionyl-CoA with acetyl-CoA to produce
3-ketovaleryl-CoA (see Figure 2). However, Gruys et al. (1998)
demonstrated that another A. eutrophus
3-ketothiolase, designated BktB, is able to produce
3-ketovaleryl-CoA from propionyl-CoA and acetyl-CoA.
Therefore, BktB is a preferred β-ketothiolase for the
production of P(3HB-co-3HV). Gruys et al. also
demonstrated that other β-ketothiolases are able to
produce 3-ketovaleryl-CoA from propionyl-CoA and
acetyl-CoA. These are: another A. eutrophus β-ketothiolase,
designated pAE65, and two β-ketothiolases from Zoogloea
ramigera, designated "A" and "B".

Gruys et al. (1998) noted that the sources of the
three copolymer biosynthetic enzymes may encompass a wide
range of organisms, including, for example, Alcaligenes
eutrophus, Alcaligenes faecalis, Aphanothece sp.,
Azotobacter vinelandii, Bacillus cereus, Bacillus
megaterium, Beijerinckia indica, Derxia gummosa,
Methylobacterium sp., Microcoleus sp., Nocardia
corallina, Pseudomonas cepacia, Pseudomonas extorquens,
Pseudomonas oleovorans, Rhodobacter sphaeroides,
Rhodobacter capsulatus, Rhodospirillum rubrum (Brandl et
al., 1990; Doi, 1990), and Thiocapsa pfennigii.



Pyruvate Dehydrogenase Complex 

The pyruvate dehydrogenase complex (PDC) is a large
multi-enzyme structure composed of three primary
component enzymes, pyruvate dehydrogenase (PDH) (E1, EC
1.2.4.1); dihydrolipoamide acetyltransferase (E2, EC
2.3.1.12); and dihydrolipoamide dehydrogenase (E3, EC
1.8.1.4) (Reed, 1974). In the well-characterized
mammalian complex, 60 subunits of E2 comprise the central
core, and the E1 and E3 components decorate the outer
surface of this core (Patel et al., 1990). E1 is a
heterotetramer composed of two α and two β subunits. The
E3 component, a homodimer, associates with the complex
via an E3 binding protein (Gopalakrishnan, 1989).

The PDC catalyzes the irreversible oxidative
decarboxylation of pyruvate according to the equation:

Pyruvate + CoA + NAD⁺ → Acetyl-CoA + CO₂ + NADH + H⁺

In mitochondria, this reaction represents the
irreversible commitment of carbon to the citric acid
cycle, and therefore is a logical point for regulation.
Previous experiments have shown that plant mitochondrial
PDC activity is, in fact, regulated by product
inhibition, metabolites, and reversible phosphorylation
(Randall et al., 1977; Randall et al., 1989; Randall et
al., 1996; Budde et al., 1991) as is the mammalian complex
(Patel et al., 1990).

In prokaryotes, PDC is localized in the cytoplasm,
while in eukaryotes it is within the mitochondrial
matrix. Plants, however, are unique in that a second
form of the complex exists in the plastids (Reid et al.,
1975; Reid et al., 1977; Thompson et al., 1977b). Based
upon enzymology (Thompson et al., 1977a; Williams et al.,
1979; Camp et al., 1988) and immunochemical analyses
(Taylor et al., 1992; Camp et al., 1985) it is clear that
plastid PDC is distinct from its mitochondrial
counterpart. In plants, de novo fatty acid biosynthesis
occurs exclusively in the plastids (Miernyk et al., 1983;
Kang et al., 1994; Zilket et al., 1969; Drennan et al.,
1969; Ohlrogge et al., 1979). The plastid form of PDC
can provide the fatty acid precursor, acetyl-CoA (Miernyk
et al., 1983; Kang et al., 1994; Grof et al., 1995). The
plastid PDC can also catalyze the oxidative
decarboxylation of 2-oxobutyrate to produce
propionyl-CoA (Camp et al., 1988; Camp and Randall, 1985).

The cDNAs that encode the E1α and E1β subunits of
plant mitochondrial PDH have been cloned (Grof et al.,
1995; Leuthy et al., 1995; Leuthy et al., 1994).
Recently, Reith and Munholland (1995) reported the
sequence of the entire plastid genome of the red alga P.
purpurea. Encoded in this genome are open reading frames
homologous to PDH α and β subunits.

The cDNAs that encode the E2 component of the plant
mitochondrial PDC have been similarly cloned (Guan et
al., 1995). The sequence of the entire plastid genome of
the cyanobacterium Synechocystis sp. has also recently
been reported (Kaneko et al., 1996).



Branched Chain 2-Oxoacid Dehydrogenase Complex 

The branched chain 2-oxoacid dehydrogenase complex
(BCOADC) is a highly ordered macromolecular structure
composed of three primary component enzymes, a branched
chain dehydrogenase or decarboxylase (BCDH or E1; EC
1.2.4.4); dihydrolipoamide transacylase (LTA or E2; no EC
number); and dihydrolipoamide dehydrogenase (LipDH or E3;
EC 1.8.1.4) (Yeaman, 1989). The mammalian complex is
assembled with 24 subunits of E2 as the central cubic
core with 4:3:2 symmetry; the E1 and E3 components
decorate the outer surface of the E2 core (Yeaman, 1989;
Wynn et al., 1996). E1 is a heterotetramer composed of
two identical α and two identical β subunits (Pettit et
al., 1978). E3 associates loosely with the E2-E1
structure, and is a homodimer (Wynn et al., 1996; Pettit
et al., 1978). The mammalian mitochondrial complex is
also regulated by a specific E1-kinase and a phospho-E1
phosphatase, which modulate activity by reversible
phosphorylation (inactivation) and dephosphorylation
(reactivation). Additional regulation is achieved by
product inhibition and modulation of gene expression
(Yeaman, 1989; Wynn et al., 1996).

BCOADC catalyzes the irreversible oxidative
decarboxylation of the branched-chain 2-oxoacids derived
from valine, leucine and isoleucine, as well as
2-oxobutyrate and 4-methyl-2-oxobutyrate, with comparable
rates and similar Km values (Yeaman, 1989; Wynn et al.,
1996; Paxton et al., 1986; Gerbling et al., 1988). The
reactions are:

2-oxo-3-methylvalerate + CoA + NAD⁺ → 2-methylbutyryl-CoA + CO₂ + NADH + H⁺
2-oxo-isovalerate + CoA + NAD⁺ → isobutyryl-CoA + CO₂ + NADH + H⁺
2-oxo-isocaproate + CoA + NAD⁺ → isovaleryl-CoA + CO₂ + NADH + H⁺
2-oxobutyrate + CoA + NAD⁺ → propionyl-CoA + CO₂ + NADH + H⁺
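
As a bookkeeping check on the reactions listed above, together with the pyruvate reaction given earlier for PDC, each oxidative decarboxylation shortens the carbon skeleton of the 2-oxoacid by exactly one carbon, released as CO₂. The following Python sketch merely tabulates this. The carbon counts are standard values for the named compounds, and the data layout and function name are hypothetical and not part of the disclosure.

```python
# (2-oxoacid substrate, acyl-CoA product, substrate carbons, acyl-group carbons)
# Acyl-group carbons exclude the carbons of the CoA moiety itself.
REACTIONS = [
    ("pyruvate", "acetyl-CoA", 3, 2),
    ("2-oxobutyrate", "propionyl-CoA", 4, 3),
    ("2-oxo-isovalerate", "isobutyryl-CoA", 5, 4),
    ("2-oxo-3-methylvalerate", "2-methylbutyryl-CoA", 6, 5),
    ("2-oxo-isocaproate", "isovaleryl-CoA", 6, 5),
]


def carbons_released(substrate_carbons, acyl_carbons):
    """Carbons lost from the substrate skeleton in forming the acyl group."""
    return substrate_carbons - acyl_carbons


for substrate, product, n_sub, n_acyl in REACTIONS:
    lost = carbons_released(n_sub, n_acyl)
    print(f"{substrate} (C{n_sub}) -> {product} (C{n_acyl} acyl) + {lost} CO2")

# Every listed reaction should release exactly one carbon as CO2.
all_balanced = all(carbons_released(s, a) == 1 for _, _, s, a in REACTIONS)
```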



In mammals, BCOADC is found in the mitochondria and
is involved in the catabolism of the branched-chain amino
acids. The only reports describing BCOADC activity in
plants have localized BCOADC to peroxisomes (Gerbling et
al., 1988; Gerbling et al., 1989). The proposed function
of a peroxisomal BCOADC is to catabolize the
branched-chain amino acids during germination and growth, yielding
an acyl-CoA product that would be further metabolized by
the beta-oxidation pathway localized in the peroxisome
(Gerbling et al., 1988; Gerbling et al., 1989).

To provide substrate pools to permit biosynthesis of
P(3HB-co-3HV) copolymer in the plastid, there is a need
for methods to engineer plants to produce plastid enzymes
which convert 2-oxobutyrate to propionyl-CoA.

Summary of the Invention 

Accordingly, the present invention provides
nucleotide sequences that encode the E1α and E1β
subunits, and the E2 component of the plastid pyruvate
dehydrogenase complex, as well as the E1α and E1β
subunits, and the E2 component of the branched chain
oxoacid dehydrogenase complex, of Arabidopsis thaliana.
Methods of utilizing these nucleotide sequences to
provide enzymatic activity to convert 2-oxobutyrate to
propionyl-CoA, and to produce P(3HB-co-3HV) copolymer in
plants, are also provided.

Accordingly, in a first aspect, the present
invention provides an isolated DNA molecule, comprising a
nucleotide sequence selected from: (a) the nucleotide
sequence shown in SEQ ID NO: 1, or the complement thereof;
(b) a nucleotide sequence that hybridizes to the
nucleotide sequence of (a) under a wash stringency
equivalent to 0.5X SSC to 2X SSC, 0.1% SDS, at 55-65°C,
and which encodes a polypeptide having enzymatic activity
similar to that of Arabidopsis thaliana plastid pyruvate
dehydrogenase complex E1α subunit; (c) a nucleotide
sequence encoding the same genetic information as the
nucleotide sequence of (a), but which is degenerate in
accordance with the degeneracy of the genetic code; and
(d) a nucleotide sequence encoding the same genetic
information as the nucleotide sequence of (b), but which
is degenerate in accordance with the degeneracy of the
genetic code. Recombinant vectors comprising such
isolated DNA molecules, host cells transformed with these
vectors, and an isolated polypeptide having the amino
acid sequence of SEQ ID NO.: 2 are also provided.
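
The notion in items (c) and (d) of a sequence that is "degenerate in accordance with the degeneracy of the genetic code" can be illustrated with a minimal Python sketch. The two short DNA strings below are hypothetical examples invented for illustration and are not taken from any SEQ ID NO of this application; the codon table is only the subset of the standard genetic code needed for them.

```python
# Subset of the standard genetic code, limited to the codons used below.
CODON_TABLE = {
    "ATG": "M",
    "GCT": "A", "GCC": "A",
    "CTG": "L", "TTA": "L",
    "TCT": "S", "AGC": "S",
    "AAA": "K", "AAG": "K",
    "TAA": "*", "TGA": "*",
}


def translate(dna):
    """Translate a DNA coding strand, codon by codon, into one-letter amino acids."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Two hypothetical sequences that differ at the nucleotide level...
seq_a = "ATGGCTCTGTCTAAATAA"
seq_b = "ATGGCCTTAAGCAAGTGA"

# ...but encode the same peptide, because they use synonymous codons.
same_protein = translate(seq_a) == translate(seq_b)
print(translate(seq_a), translate(seq_b), same_protein)
```

In this sense the two strings carry the same genetic information while differing in nucleotide sequence.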

In another aspect, the present invention provides an
isolated DNA molecule, comprising a nucleotide sequence
selected from: (a) the nucleotide sequence shown in SEQ
ID NO: 3, or the complement thereof; (b) a nucleotide
sequence that hybridizes to the nucleotide sequence of
(a) under a wash stringency equivalent to 0.5X SSC to 2X
SSC, 0.1% SDS, at 55-65°C, and which encodes a
polypeptide having enzymatic activity similar to that of
Arabidopsis thaliana plastid pyruvate dehydrogenase
complex E1β subunit; (c) a nucleotide sequence encoding
the same genetic information as the nucleotide sequence
of (a), but which is degenerate in accordance with the
degeneracy of the genetic code; and (d) a nucleotide
sequence encoding the same genetic information as the
nucleotide sequence of (b), but which is degenerate in
accordance with the degeneracy of the genetic code.
Recombinant vectors comprising such isolated DNA
molecules, host cells transformed with these vectors, and
an isolated polypeptide having the amino acid sequence of
SEQ ID NO.: 4 are also provided.

In another aspect, the present invention provides an
isolated DNA molecule, comprising a nucleotide sequence
selected from: (a) the nucleotide sequence shown in SEQ
ID NO: 5, or the complement thereof; (b) a nucleotide
sequence that hybridizes to the nucleotide sequence of
(a) under a wash stringency equivalent to 0.5X SSC to 2X
SSC, 0.1% SDS, at 55-65°C, and which encodes a
polypeptide having enzymatic activity similar to that of
Arabidopsis thaliana plastid pyruvate dehydrogenase
complex E2 component; (c) a nucleotide sequence encoding
the same genetic information as the nucleotide sequence
of (a), but which is degenerate in accordance with the
degeneracy of the genetic code; and (d) a nucleotide
sequence encoding the same genetic information as the
nucleotide sequence of (b), but which is degenerate in
accordance with the degeneracy of the genetic code.
Recombinant vectors comprising such isolated DNA
molecules, host cells transformed with these vectors, and
an isolated polypeptide having the amino acid sequence of
SEQ ID NO.: 6 are also provided.
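
The wash stringency recited in item (b) of each of these aspects can be put in more concrete terms. The following Python sketch assumes the conventional SSC formulation (1X = 0.15 M NaCl and 0.015 M trisodium citrate), which this text does not itself define, and estimates the sodium ion concentration across the recited range. The helper that tests the recited range is hypothetical and checks only the literal numeric bounds, not what would count as an "equivalent" stringency.

```python
# Assumed conventional 1X SSC composition (not defined in the text itself).
NACL_MOLAR_1X = 0.150
TRISODIUM_CITRATE_MOLAR_1X = 0.015
SODIUM_PER_CITRATE = 3  # trisodium citrate contributes three Na+ per formula unit


def sodium_molarity(ssc_fold):
    """Approximate total Na+ concentration (mol/L) of an SSC wash at a given fold."""
    per_1x = NACL_MOLAR_1X + SODIUM_PER_CITRATE * TRISODIUM_CITRATE_MOLAR_1X
    return ssc_fold * per_1x


def within_recited_wash(ssc_fold, sds_percent, temp_c):
    """True if the conditions fall literally inside 0.5X-2X SSC, 0.1% SDS, 55-65 C."""
    return (
        0.5 <= ssc_fold <= 2.0
        and abs(sds_percent - 0.1) < 1e-9
        and 55 <= temp_c <= 65
    )


for fold in (0.5, 1.0, 2.0):
    print(f"{fold}X SSC: about {sodium_molarity(fold):.4f} M Na+")
print(within_recited_wash(1.0, 0.1, 60))
```

Lower SSC concentration and higher temperature both make the wash more stringent, so 0.5X SSC at 65°C is the most stringent corner of the recited range.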

In a further aspect, the present invention provides
an isolated DNA molecule, comprising a nucleotide
sequence selected from: (a) the nucleotide sequence shown
in SEQ ID NO: 11, or the complement thereof; (b) a
nucleotide sequence that hybridizes to the nucleotide
sequence of (a) under a wash stringency equivalent to
0.5X SSC to 2X SSC, 0.1% SDS, at 55-65°C, and which
encodes a polypeptide having enzymatic activity similar
to that of Arabidopsis thaliana branched chain 2-oxoacid
dehydrogenase complex E1α subunit; (c) a nucleotide
sequence encoding the same genetic information as the
nucleotide sequence of (a), but which is degenerate in
accordance with the degeneracy of the genetic code; and
(d) a nucleotide sequence encoding the same genetic
information as the nucleotide sequence of (b), but which
is degenerate in accordance with the degeneracy of the
genetic code. Recombinant vectors comprising such
isolated DNA molecules, host cells transformed with these
vectors, and an isolated polypeptide having the amino
acid sequence of SEQ ID NO.: 12 are also provided.
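
Item (a) of each of these aspects covers both the recited nucleotide sequence and "the complement thereof". As a minimal Python sketch, the complementary strand can be derived as follows. This assumes that the complement is read as the reverse complement written 5' to 3', which is an interpretive assumption made here for illustration; the example string is hypothetical and is not taken from any SEQ ID NO.

```python
# Translation table mapping each base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Return the complementary strand of a DNA sequence, written 5' to 3'."""
    return dna.translate(_COMPLEMENT)[::-1]


example = "ATGGCT"  # hypothetical six-base sequence
print(reverse_complement(example))
```

Applying the function twice returns the original sequence, reflecting that the two strands are complements of each other.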

In another aspect, the present invention provides an
isolated DNA molecule, comprising a nucleotide sequence
selected from: (a) the nucleotide sequence shown in SEQ
ID NO: 13, or the complement thereof; (b) a nucleotide
sequence that hybridizes to the nucleotide sequence of
(a) under a wash stringency equivalent to 0.5X SSC to 2X
SSC, 0.1% SDS, at 55-65°C, and which encodes a
polypeptide having enzymatic activity similar to that of
Arabidopsis thaliana branched chain 2-oxoacid
dehydrogenase complex E1β subunit; (c) a nucleotide
sequence encoding the same genetic information as the
nucleotide sequence of (a), but which is degenerate in
accordance with the degeneracy of the genetic code; and
(d) a nucleotide sequence encoding the same genetic
information as the nucleotide sequence of (b), but which
is degenerate in accordance with the degeneracy of the
genetic code. Recombinant vectors comprising such
isolated DNA molecules, host cells transformed with these
vectors, and an isolated polypeptide having the amino
acid sequence of SEQ ID NO.: 14 are also provided.

In another aspect, the present invention provides
the foregoing isolated DNA molecules encoding a
polypeptide having enzymatic activity similar to that of
Arabidopsis thaliana branched chain 2-oxoacid
dehydrogenase complex E1β subunit, but in which the
naturally occurring branched chain oxoacid dehydrogenase
complex E2 component binding region thereof is replaced
with the E2 component binding region of a plastid
pyruvate dehydrogenase complex E1β subunit. The
plastid pyruvate dehydrogenase complex E1β subunit can
have the sequence shown in SEQ ID NO.: 3. Recombinant
vectors comprising such isolated DNA molecules, host
cells transformed with these vectors, and the isolated
polypeptide are also provided.

In yet another aspect, the present invention
provides an isolated DNA molecule, comprising a
nucleotide sequence selected from: (a) the nucleotide
sequence shown in SEQ ID NO: 15, or the complement
thereof; (b) a nucleotide sequence that hybridizes to the
nucleotide sequence of (a) under a wash stringency
equivalent to 0.5X SSC to 2X SSC, 0.1% SDS, at 55-65°C,
and which encodes a polypeptide having enzymatic activity
similar to that of Arabidopsis thaliana branched chain
2-oxoacid dehydrogenase complex E2 component; (c) a
nucleotide sequence encoding the same genetic information
as the nucleotide sequence of (a), but which is
degenerate in accordance with the degeneracy of the
genetic code; and (d) a nucleotide sequence encoding the
same genetic information as the nucleotide sequence of
(b), but which is degenerate in accordance with the
degeneracy of the genetic code. Recombinant vectors
comprising such isolated DNA molecules, host cells
transformed with these vectors, and an isolated
polypeptide having the amino acid sequence of SEQ ID
NO.: 16 are also provided.

In another aspect, the present invention provides
a plant, a plastid of which comprises the following
polypeptides: an enzyme that enhances the biosynthesis of
2-oxobutyrate; a branched chain oxoacid dehydrogenase
complex E1α subunit; a branched chain oxoacid
dehydrogenase complex E1β subunit; and a branched chain
oxoacid dehydrogenase complex E2 component. The branched
chain oxoacid dehydrogenase complex E1α subunit can have
the sequence shown in SEQ ID NO.: 12, the branched chain
oxoacid dehydrogenase complex E1β subunit can have the
sequence shown in SEQ ID NO.: 14, or the branched chain
oxoacid dehydrogenase complex E2 component can have the
sequence shown in SEQ ID NO.: 16. In such plant, the
plastid can further comprise the following polypeptides:
a β-ketothiolase; a β-ketoacyl-CoA reductase; and a
polyhydroxyalkanoate synthase. The genome of such plant
can comprise introduced DNAs encoding these
polypeptides, wherein each of the introduced DNAs is
operatively linked to a targeting peptide coding region
capable of directing transport of the polypeptide encoded
thereby into a plastid. A method of producing
P(3HB-co-3HV) copolymer comprises growing such plant, and
recovering P(3HB-co-3HV) copolymer produced thereby.

In another aspect, the present invention comprises a
plant, a plastid of which comprises the following
polypeptides: an enzyme that enhances the biosynthesis of
2-oxobutyrate; a branched chain oxoacid dehydrogenase
complex E1α subunit; a branched chain oxoacid
dehydrogenase complex E1β subunit; a branched chain
oxoacid dehydrogenase complex E2 component; and a
dihydrolipoamide dehydrogenase E3 component, which can be
mitochondrially-derived. In such plant, the branched
chain oxoacid dehydrogenase complex E1α subunit can have
the sequence shown in SEQ ID NO: 12, the branched chain
oxoacid dehydrogenase complex E1β subunit can have the
sequence shown in SEQ ID NO: 14, or the branched chain
oxoacid dehydrogenase complex E2 component can have the
sequence shown in SEQ ID NO: 16. In such plant, the
plastid can further comprise the following polypeptides:
a β-ketothiolase; a β-ketoacyl-CoA reductase; and a
polyhydroxyalkanoate synthase. The genome of such plant
can comprise introduced DNAs encoding these polypeptides,
wherein each of the introduced DNAs is operatively linked
to a targeting peptide coding region capable of directing
transport of the polypeptide encoded thereby into a
plastid. A method of producing P(3HB-co-3HV) copolymer
comprises growing such plant, and recovering
P(3HB-co-3HV) copolymer produced thereby.

In yet another aspect, the present invention
provides a plant, a plastid of which comprises the
following polypeptides: an enzyme that enhances the
biosynthesis of 2-oxobutyrate; a branched chain oxoacid
dehydrogenase complex E1α subunit; and a branched chain
oxoacid dehydrogenase complex E1β subunit, the naturally
occurring E2 binding region of which is replaced with the
E2 binding region of a plastid pyruvate dehydrogenase
complex E1β subunit. In such plant, the branched chain
oxoacid dehydrogenase complex E1α subunit can have the
sequence shown in SEQ ID NO: 12. Furthermore, in such
plant, the plastid can further comprise the following
polypeptides: a β-ketothiolase; a β-ketoacyl-CoA
reductase; and a polyhydroxyalkanoate synthase. In such
plant, the genome can comprise introduced DNAs encoding
these polypeptides, wherein each of the introduced DNAs
is operatively linked to a targeting peptide coding
region capable of directing transport of the polypeptide
encoded thereby into a plastid. A method of producing
P(3HB-co-3HV) copolymer comprises growing such plant, and
recovering P(3HB-co-3HV) copolymer produced thereby.

Further scope of the applicability of the present
invention will become apparent from the detailed
description and drawings provided below. However, it
should be understood that the following detailed
description and examples, while indicating preferred
embodiments of the invention, are given by way of
illustration only since various changes and modifications
within the spirit and scope of the invention will become
apparent to those skilled in the art from this detailed
description.

Brief Description of the Drawings 

The above and other objects, features, and
advantages of the present invention will be better
understood from the following detailed description taken
in conjunction with the accompanying drawings, all of
which are given by way of illustration only, and are not
limitative of the present invention, in which:

Figure 1 shows the biochemical steps involved in the
production of PHB from acetyl-CoA catalyzed by the A.
eutrophus PHB biosynthetic enzymes.

Figure 2 shows the biochemical steps involved in the
production of P(3HB-co-3HV) copolymer from acetyl-CoA
and propionyl-CoA catalyzed by PHA biosynthetic enzymes
of A. eutrophus.

Figure 3 summarizes the pathways discussed herein
that are involved in the production of P(3HB-co-3HV)
copolymer, including enzymes that can be used to enhance
2-oxobutyrate biosynthesis.

Figure 4 shows Southern analyses of genomic DNA
isolated from mature A. thaliana leaves. Each lane was
loaded with 10 µg of DNA digested with Bam HI, Hind III,
Sal I, Eco RI or Xba I as indicated. Fig. 4A and 4B,
genomic Southern blots hybridized with random primed
probes generated from gel-excised E1α and E1β cDNAs,
respectively. (α-32P)-dCTP was incorporated using an
oligolabelling kit (Pharmacia, Uppsala, Sweden). The
positions of λ DNA markers digested with Hind III are
indicated to the left of the figure.

Figure 5 shows Northern blot analyses of A. thaliana
RNA. Total RNA was isolated from young leaves of A.
thaliana plants. 10 µg of total RNA was run on
formaldehyde gels then transferred to nylon membranes.
Probes were prepared as described in the legend for
Figure 4. RNA markers were used to determine the sizes
of the hybridizing bands.

Figures 6A and 6B show dendrogram analyses of the
deduced amino acid sequences of PDH E1α and E1β subunits,
respectively. Abbreviations and accession numbers to the
sequences are: P. p., Porphyra purpurea, odp (U38804); S.
sp., Synechocystis sp. (D90915); A. t., Arabidopsis
thaliana (U21214, U09137); P. s., Pisum sativum (U51918,
U56697); H. s., Homo sapiens (L13318, D90086); R. r.,
Rattus rattus (Z12158, P49432); S. c., Saccharomyces
cerevisiae (P16387, M98476); A. s., Ascaris suum (M76554,
M38017); M. gen., Mycoplasma genitalium (U39706); M. c.,
Mycoplasma capricolum (U62057); B. su., Bacillus subtilis
(M57435); and B. s., Bacillus stearothermophilus
(X53560). Dendrogram analysis was accomplished with the
GeneWorks CLUSTAL V method (IntelliGenetics, Mountain
View, CA). CLUSTAL V parameters were as follows: cost to
open gap = 5, cost to lengthen gap = 25, gap penalty = 3,
number of top diagonals = 5, window size = 5, PAM matrix
= PAM250, K-tuple = 1, and consensus cutoff = 50%.

Figures 7A-7E show schematics (Constructs 1-5) for
engineering the BCOADC subunits to be targeted to the
plastid and to form a hybrid complex, as described in
Examples 6 and 7.



Detailed Description of the Invention

The following detailed description is provided to
aid those skilled in the art in practicing the present
invention. Even so, this detailed description should not
be construed to unduly limit the present invention as
modifications and variations in the embodiments discussed
herein can be made by those of ordinary skill in the art
without departing from the spirit or scope of the present
inventive discovery.

The contents of each of the references cited herein,
including those of the references cited within these
primary references, are herein incorporated by reference
in their entirety.

The production of P(3HB-co-3HV) in plants requires
the substrates propionyl-CoA and acetyl-CoA, and three
enzymes which convert these substrates to P(3HB-co-3HV):
a β-ketothiolase, a β-ketoacyl-CoA reductase, and a PHA
synthase. β-ketothiolase is normally present in the
plant cytoplasm, but not in the plastids. Acetyl-CoA is
normally present in the cytoplasm and the plastids. All
of the other required components must be introduced into
the plant, preferably into the plastids.

Gruys et al. (1998) discuss several ways in which
2-oxobutyrate can be provided in the plant. One way is
through the manipulation of various wild-type and/or
deregulated enzymes involved in the biosynthesis of
aspartate family amino acids in order to increase
threonine levels, thereby creating a larger substrate
pool for threonine deaminase to convert to 2-oxobutyrate
(Figure 3), and wild-type or deregulated forms of
enzymes, such as threonine deaminase, involved in the
conversion of threonine to P(3HB-co-3HV) copolymer
endproduct. Enzymes which can be manipulated to enhance
the threonine pool include aspartate kinase, homoserine
dehydrogenase, and threonine synthase. The threonine
substrate pool can be enhanced by overexpression of these
enzymes, or by the use of deregulated forms of these
enzymes, such as lysine-deregulated aspartate kinase.
Threonine deaminase, which converts threonine to
2-oxobutyrate, is another enzyme which can be utilized in
the production of 2-oxobutyrate. Deregulated mutants and
natural deregulated forms of threonine deaminase can be
used to increase 2-oxobutyrate pools at the site of
copolymer biosynthesis.

Gruys et al. (1998), at Example 6, also discuss
several ways in which the PDC and/or the BCOADC, or their
substrate pools, can be manipulated to provide effective
conversion of 2-oxobutyrate to propionyl-CoA. The native
plastid PDC is able to perform this conversion at a low
level. However, this complex can provide levels of
propionyl-CoA sufficient for P(3HB-co-3HV) if the levels
of 2-oxobutyrate are sufficient, or if portions of the
BCOADC are employed to form a hybrid complex. The
plastid PDC might also be genetically manipulated to be
more effective in providing propionyl-CoA (Gruys et al.,
1998).

The present invention provides nucleotide sequences
that encode the E1α and E1β subunits, and the E2
component, of the plastid pyruvate dehydrogenase complex,
and the E1α and E1β subunits, and the E2 component, of
the branched chain oxoacid dehydrogenase complex of
Arabidopsis thaliana. These nucleotide sequences and the
enzymatic polypeptides encoded thereby can be introduced
into plants in various combinations with coding sequences
for the foregoing enzymes in order to enhance the
conversion of threonine to 2-oxobutyrate, propionate,
propionyl-CoA, β-ketovaleryl-CoA, and
β-hydroxyvaleryl-CoA. Introduction into such plants of
nucleic acid sequences encoding an appropriate
β-ketothiolase, a β-ketoacyl-CoA reductase, and a PHA
synthase will permit such transgenic plants to utilize
the increased β-hydroxyvaleryl-CoA substrate in the
production of P(3HB-co-3HV) copolymer.

Definitions

The following definitions are provided to aid those
skilled in the art in understanding the detailed
description of the present invention.

"β-ketoacyl-CoA reductase" refers to a
β-ketoacyl-CoA reducing enzyme that can convert a
β-ketoacyl-CoA substrate to its corresponding
β-hydroxyacyl-CoA product using, for example, NADH or
NADPH as the reducing cosubstrate. An example is the
PhbB acetoacetyl-CoA reductase of A. eutrophus.

"β-ketothiolase" refers to an enzyme that catalyzes
the thiolytic cleavage of a β-ketoacyl-CoA, requiring
free CoA, to form two acyl-CoA molecules. However, the
term β-ketothiolase as used herein also refers to
enzymes that catalyze the condensation of two acyl-CoA
molecules to form β-ketoacyl-CoA and free CoA, i.e., the
reverse of the thiolytic cleavage reaction.

"CoA" refers to coenzyme A.

"C-terminal" refers to the region of a peptide,
polypeptide, or protein chain from the middle thereof to
the end that carries the amino acid having a free
α-carboxyl group.

"Deregulated enzyme" refers to an enzyme that has
been modified, for example by mutagenesis, wherein the
extent of feedback inhibition of the catalytic activity
of the enzyme by a metabolite is reduced such that the
enzyme exhibits enhanced activity in the presence of said
metabolite compared to the unmodified enzyme. Some
organisms possess deregulated forms of such enzymes as
the naturally occurring, wild-type form.

The term "DNA encoding" or "encoding DNA" refers to
chromosomal DNA, plasmid DNA, cDNA, plastid DNA, or
synthetic DNA which codes for expression of any of the
enzymes discussed herein.

The term "genome" as it applies to bacteria
encompasses both the chromosome and plasmids within a
bacterial host cell. Unless specified, the term "genome"
as it applies to plant cells encompasses not only
chromosomal or nuclear DNA found within the nucleus, but
also organellar DNA found within subcellular components
of the cell. DNAs of the present invention introduced
into plant cells can therefore be either
chromosomally-integrated or organelle-localized, unless
specified (e.g., "plastid genome").

The term "mutein" refers to a mutant form of a
peptide, polypeptide, or protein.

"N-terminal" refers to the region of a peptide,
polypeptide, or protein chain from the amino acid having
a free α-amino group to the middle of the chain.

"Operably linked" refers to two amino acid or
nucleotide sequences wherein one of the sequences
operates to affect a characteristic of the other
sequence. In the case of nucleotide sequences, for
example, a promoter "operably linked" to a structural
coding sequence acts to drive expression of the latter.

"Overexpression" refers to the expression of a
polypeptide or protein encoded by a DNA introduced into a
host cell, wherein said polypeptide or protein is either
not normally present in the host cell, or wherein said
polypeptide or protein is present in said host cell at a
higher level than that normally expressed from the
endogenous gene encoding said polypeptide or protein.

The term "plastid" refers to the class of plant cell
organelles that includes amyloplasts, chloroplasts,
chromoplasts, elaioplasts, eoplasts, etioplasts,
leucoplasts, and proplastids. These organelles are
self-replicating, and contain what is commonly referred
to as the chloroplast genome, a circular DNA molecule
that ranges in size from about 120 to about 217 kb,
depending upon the plant species, and which usually
contains an inverted repeat region (Fosket, 1994).

The term "polyhydroxyalkanoate (PHA) synthase"
refers to enzymes that convert β-hydroxyacyl-CoAs to
polyhydroxyalkanoates and free CoA.

"Targeting sequence" refers to a nucleotide sequence
which, when expressed (forming a "targeting peptide"),
directs the export of an attached polypeptide to a
particular cellular location, such as the chloroplast
(e.g., "chloroplast targeting sequence"). The words
"signal" or "transit" are equivalent to "targeting" in
this context.

Production of Transgenic Plants Capable of Producing
P(3HB-co-3HV) Copolymer

PHA synthesis in plants can be optimized in
accordance with the present invention by expressing DNAs
encoding β-ketothiolase, β-ketoacyl-CoA reductase, and PHA
synthase in conjunction with various portions and
combinations of precursor-producing enzymes, including
the sequences encoding portions of the plastid PDC and
the BCOADC provided herein, as discussed in the Examples
below.

Plant Vectors 

In plants, transformation vectors capable of
introducing encoding DNAs involved in PHA biosynthesis
are easily designed, and generally contain one or more
DNA coding sequences of interest under the
transcriptional control of 5' and 3' regulatory
sequences. Such vectors generally comprise, operatively
linked in sequence in the 5' to 3' direction, a promoter
sequence that directs the transcription of a downstream
heterologous structural DNA in a plant; optionally, a 5'
non-translated leader sequence; a nucleotide sequence
that encodes a protein of interest; and a 3'
non-translated region that encodes a polyadenylation
signal which functions in plant cells to cause the
termination of transcription and the addition of
polyadenylate nucleotides to the 3' end of the mRNA
encoding said protein. Plant transformation vectors also
generally contain a selectable marker. Typical 5'-3'
regulatory sequences include a transcription initiation
start site, a ribosome binding site, an RNA processing
signal, a transcription termination site, and/or a
polyadenylation signal. Vectors for plant transformation
have been reviewed in Rodriguez et al. (1988), Glick et
al. (1993), and Croy (1993).
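
The 5' to 3' ordering of vector elements described above can be sketched as a small Python check. This is an illustration only: the role names, the `Element` class and the `is_valid_cassette` function are hypothetical, and the element labels in the example are placeholders rather than a construct taken from this specification.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Element:
    role: str  # one of the roles listed in CANONICAL_ORDER
    name: str  # free-text label for the element


# 5' -> 3' order of an expression cassette; the leader is optional.
CANONICAL_ORDER = ["promoter", "leader", "coding_sequence", "terminator"]
OPTIONAL_ROLES = {"leader"}


def is_valid_cassette(elements: List[Element]) -> bool:
    """Check that required elements are present, unique, and in 5'->3' order."""
    roles = [e.role for e in elements]
    # Reject unknown or duplicated roles.
    if any(r not in CANONICAL_ORDER for r in roles) or len(set(roles)) != len(roles):
        return False
    # Every non-optional role must be present.
    if any(r not in roles for r in CANONICAL_ORDER if r not in OPTIONAL_ROLES):
        return False
    # Elements must appear in canonical order.
    positions = [CANONICAL_ORDER.index(r) for r in roles]
    return positions == sorted(positions)


cassette = [
    Element("promoter", "plant-functional promoter"),
    Element("coding_sequence", "protein of interest"),
    Element("terminator", "3' non-translated region with polyadenylation signal"),
]
print(is_valid_cassette(cassette))
```

A selectable marker would normally sit in its own cassette on the same vector, so it is left out of this single-cassette check.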

Plant Promoters

Plant promoter sequences can be constitutive or
inducible, environmentally- or developmentally-regulated,
or cell- or tissue-specific. Often-used constitutive
promoters include the CaMV 35S promoter (Odell et al.,
1985), the enhanced CaMV 35S promoter, the Figwort Mosaic
Virus (FMV) promoter (Richins et al., 1987), the
mannopine synthase (mas) promoter, the nopaline synthase
(nos) promoter, and the octopine synthase (ocs) promoter.
Useful inducible promoters include heat-shock promoters
(Ou-Lee et al., 1986; Ainley et al., 1990), a
nitrate-inducible promoter derived from the spinach
nitrite reductase gene (Back et al., 1991),
hormone-inducible promoters (Yamaguchi-Shinozaki et al.,
1990; Kares et al., 1990), and light-inducible promoters
associated with the small subunit of RuBP carboxylase and
LHCP gene families (Kuhlemeier et al., 1989; Feinbaum et
al., 1991; Weisshaar et al., 1991; Lam and Chua, 1990;
Castresana et al., 1988; Schulze-Lefert et al., 1989).
Examples of useful tissue-specific,
developmentally-regulated promoters include the
β-conglycinin 7S promoter (Doyle et al., 1986; Slighton
and Beachy, 1987), and seed-specific promoters (Knutzon
et al., 1992; Bustos et al., 1991; Lam and Chua, 1991;
Stayton et al., 1991). Plant functional promoters useful
for preferential expression in seed plastids include
those from plant storage protein genes and from genes
involved in fatty acid biosynthesis in oilseeds.
Examples of such promoters include the 5' regulatory
regions from such genes as napin (Kridl et al., 1991),
phaseolin, zein, soybean trypsin inhibitor, ACP,
stearoyl-ACP desaturase, and oleosin. Seed-specific gene
regulation is discussed in EP 0 255 378. Promoter
hybrids can also be constructed to enhance
transcriptional activity (Hoffman, U.S. Patent No.
5,106,739), or to combine desired transcriptional
activity and tissue specificity.

A factor to be considered in the choice of promoters
is the timing of availability of the necessary substrates
during expression of the PHA biosynthetic enzymes. For
example, if P(3HB-co-3HV) copolymer is produced in seeds
from threonine, the timing of threonine biosynthesis and
the amount of free threonine are important
considerations. Karchi et al. (1994) have reported that
threonine biosynthesis occurs rather late in seed
development, similar to the timing of seed storage
protein accumulation. For example, if enzymes involved
in P(3HB-co-3HV) copolymer biosynthesis are expressed
from the 7S seed-specific promoter, the timing of
expression thereof will be concurrent with threonine
accumulation.

Plant Transformation and Regeneration 

A variety of different methods can be employed to
introduce such vectors into plant protoplasts, cells,
callus tissue, leaf discs, meristems, etc., to generate
transgenic plants, including Agrobacterium-mediated
transformation, particle gun delivery, microinjection,
electroporation, polyethylene glycol-mediated protoplast
transformation, liposome-mediated transformation, etc.
(reviewed in Potrykus, 1991). In general, transgenic
plants comprising cells containing and expressing DNAs
encoding enzymes facilitating PHA biosynthesis can be
produced by transforming plant cells with a DNA construct
as described above via any of the foregoing methods;
selecting plant cells that have been transformed on a
selective medium; regenerating plant cells that have been
transformed to produce differentiated plants; and
selecting a transformed plant which expresses the
enzyme-encoding nucleotide sequence.

Constitutive overexpression of, for example, a
deregulated threonine deaminase employing the CaMV 35S or
FMV promoter might potentially starve plants of certain
amino acids, especially those of the aspartate family.
If such starvation occurs, the negative effects may be
avoided by supplementing the growth and cultivation media
employed in the transformation and regeneration
procedures with appropriate amino acids. By
supplementing the transformation/regeneration media with
aspartate family amino acids (aspartate, threonine,
lysine, and methionine), the uptake of these amino acids
into the plant can reduce any potential starvation effect
caused by an overexpressed threonine deaminase.
Supplementation of the media with such amino acids might
thereby prevent any negative selection, and therefore any
adverse effect on transformation frequency, due to the
overexpression of a deregulated threonine deaminase in
the transformed plant.

The encoding DNAs can be introduced either in a
single transformation event (all necessary DNAs present
on the same vector), a co-transformation event (all
necessary DNAs present on separate vectors that are
introduced into plants or plant cells simultaneously), or
by independent transformation events (all necessary DNAs
present on separate vectors that are introduced into
plants or plant cells independently). Traditional
breeding methods can subsequently be used to incorporate
the entire pathway into a single plant. Successful
production of the PHA polyhydroxybutyrate in cells of
Arabidopsis has been demonstrated by Poirier et al.
(1992), and in plastids thereof by Nawrath et al. (1994).
Specific methods for transforming a wide variety of
dicots and obtaining transgenic plants are well
documented in the literature (Gasser and Fraley, 1989;
Fisk and Dandekar, 1993; Christou, 1994; and the
references cited therein).

Successful transformation and plant regeneration
have been achieved in the monocots as follows: asparagus
(Asparagus officinalis; Bytebier et al., 1987); barley
(Hordeum vulgare; Wan and Lemaux, 1994); maize (Zea mays;
Rhodes et al., 1988; Gordon-Kamm et al., 1990; Fromm et
al., 1990; Koziel et al., 1993); oats (Avena sativa;
Somers et al., 1992); orchardgrass (Dactylis glomerata;
Horn et al., 1988); rice (Oryza sativa, including indica
and japonica varieties; Toriyama et al., 1988; Zhang et
al., 1988; Luo and Wu, 1988; Zhang and Wu, 1988; Christou
et al., 1991); rye (Secale cereale; De la Pena et al.,
1987); sorghum (Sorghum bicolor; Cassas et al., 1993);
sugar cane (Saccharum spp.; Bower and Birch, 1992); tall
fescue (Festuca arundinacea; Wang et al., 1992); turfgrass
(Agrostis palustris; Zhong et al., 1993); and wheat
(Triticum aestivum; Vasil et al., 1992; Weeks et al., 1993;
Becker et al., 1994).

Host Plants

Particularly useful plants for PHA copolymer
production include those that produce carbon substrates
which can be employed for PHA biosynthesis, including
tobacco, wheat, potato, Arabidopsis, and high oil seed
plants such as corn, soybean, canola, oil seed rape,
sunflower, flax, and peanut. Polymers that can be
produced in this manner include copolymers incorporating
both short chain length and medium chain length monomers,
such as P(3HB-co-3HV) copolymer.

If the host plant of choice does not produce the
requisite fatty acid substrates in sufficient quantities,
it can be modified, for example by mutagenesis or genetic
transformation, to block or modulate the glycerol ester
and fatty acid biosynthesis or degradation pathways so
that it accumulates the appropriate substrates for PHA
production.



Plastid Targeting of Expressed Enzymes for PHA 
Biosynthesis 

PHA polymer can be produced in plants either by
expression of the appropriate enzymes in the cytoplasm
(Poirier et al., 1992) by the methods described above, or
more preferably, in plastids, where higher levels of PHA
production can be achieved (Nawrath et al., 1994). As
demonstrated by the latter group, targeting of
β-ketothiolase, acetoacetyl-CoA reductase, and PHB
synthase to plastids of Arabidopsis thaliana results in
the accumulation of high levels of PHB in the plastids
without any readily apparent deleterious effects on plant
growth and seed production. As branched-chain amino acid
biosynthesis occurs in plant plastids (Bryan, 1980;
Galili, 1995), overexpression therein of plastid-targeted
enzymes, including a deregulated form of threonine
deaminase, is expected to facilitate the production of
elevated levels of 2-oxobutyrate and propionyl-CoA.
The latter can be condensed with acetyl-CoA by
β-ketothiolase to form β-ketovaleryl-CoA, which can
then be further metabolized by a β-ketoacyl-CoA
reductase to β-hydroxyvaleryl-CoA, the precursor of the
C5 subunit of P(3HB-co-3HV) copolymer. As there is a
high carbon flux through acetyl-CoA in plastids,
especially in seeds of oil-accumulating plants such as
oilseed rape (Brassica napus), canola (Brassica rapa,
Brassica campestris, Brassica carinata, and Brassica
juncea), soybean (Glycine max), flax (Linum
usitatissimum), and sunflower (Helianthus annuus) for
example, targeting of the gene products of desired
encoding DNAs to leucoplasts of seeds, or
transformation of seed leucoplasts and expression
therein of these encoding DNAs, are attractive
strategies for achieving high levels of PHA
biosynthesis in plants.
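
The carbon bookkeeping behind the C4 (3HB) and C5 (3HV) monomers can be checked with a short Python sketch. The acyl chain lengths (acetyl, 2 carbons; propionyl, 3 carbons) are standard chemistry; the dictionary and function names are illustrative assumptions only.

```python
from typing import Optional

# Number of carbons in the acyl group of each CoA thioester substrate.
ACYL_CARBONS = {"acetyl-CoA": 2, "propionyl-CoA": 3}

# Monomer unit of the copolymer, keyed by the carbon count of its precursor.
MONOMER_BY_CARBONS = {4: "3HB", 5: "3HV"}


def condensation_product_carbons(acyl_a: str, acyl_b: str) -> int:
    """Carbons in the beta-ketoacyl group formed when two acyl-CoAs condense.

    The condensation releases free CoA, which carries no acyl carbons,
    so the acyl carbon counts simply add.
    """
    return ACYL_CARBONS[acyl_a] + ACYL_CARBONS[acyl_b]


def monomer_from(acyl_a: str, acyl_b: str) -> Optional[str]:
    """Monomer obtained after reduction of the condensation product.

    Reduction of the beta-keto group leaves the carbon count unchanged.
    Returns None for combinations outside the two monomers modelled here.
    """
    return MONOMER_BY_CARBONS.get(condensation_product_carbons(acyl_a, acyl_b))


print(monomer_from("acetyl-CoA", "acetyl-CoA"))     # C4 unit
print(monomer_from("acetyl-CoA", "propionyl-CoA"))  # C5 unit
```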

All of the enzymes discussed herein can be
modified for plastid targeting by employing plant cell
nuclear transformation constructs wherein DNA coding
sequences of interest are fused to any of the available
transit peptide sequences capable of facilitating
transport of the encoded enzymes into plant plastids
(partially summarized in von Heijne et al., 1991), and
driving expression by employing an appropriate
promoter. The sequences that encode a transit peptide
region can be obtained, for example, from plant
nuclear-encoded plastid proteins, such as the small
subunit (SSU) of ribulose bisphosphate carboxylase,
plant fatty acid biosynthesis related genes including
acyl carrier protein (ACP), stearoyl-ACP desaturase,
β-ketoacyl-ACP synthase and acyl-ACP thioesterase, or
LHCPII genes. The encoding sequence for a transit
peptide effective in transport to plastids can include
all or a portion of the encoding sequence for a
particular transit peptide, and may also contain
portions of the mature protein encoding sequence
associated with a particular transit peptide. Numerous
examples of transit peptides that can be used to
deliver target proteins into plastids exist, and the
particular transit peptide encoding sequences useful in
the present invention are not critical as long as
delivery into a plastid is obtained. Proteolytic
processing within the plastid then produces the mature
enzyme. This technique has proven successful not only
with enzymes involved in PHA synthesis (Nawrath et al.,
1994), but also with neomycin phosphotransferase II
(NPT-II) and CP4 EPSPS (Padgette et al., 1995), for
example.
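
A translational fusion of a transit peptide coding region to an enzyme coding sequence must keep both parts in one reading frame. The following minimal Python sketch checks this under simple assumptions: both inputs are coding-strand DNA, the enzyme sequence keeps its own start codon, and the two sequences shown are short hypothetical placeholders, not a real transit peptide or enzyme. The function names are illustrative only.

```python
from typing import List

STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(dna: str) -> List[str]:
    """Split a sequence into complete in-frame codons, starting at its first base."""
    return [dna[i:i + 3] for i in range(0, len(dna) - len(dna) % 3, 3)]


def fuse_in_frame(transit_cds: str, enzyme_cds: str) -> str:
    """Join a transit peptide coding region to an enzyme coding sequence.

    Raises ValueError if the fusion would shift the reading frame or
    terminate translation before the end of the enzyme sequence.
    """
    transit_cds, enzyme_cds = transit_cds.upper(), enzyme_cds.upper()
    if not transit_cds.startswith("ATG"):
        raise ValueError("transit peptide region should begin with a start codon")
    if len(transit_cds) % 3 != 0 or len(enzyme_cds) % 3 != 0:
        raise ValueError("both regions must be a whole number of codons")
    if any(c in STOP_CODONS for c in codons(transit_cds)):
        raise ValueError("transit peptide region contains an in-frame stop codon")
    enzyme_codons = codons(enzyme_cds)
    if any(c in STOP_CODONS for c in enzyme_codons[:-1]):
        raise ValueError("enzyme sequence contains a premature stop codon")
    if enzyme_codons[-1] not in STOP_CODONS:
        raise ValueError("enzyme sequence should end with a stop codon")
    return transit_cds + enzyme_cds


# Hypothetical placeholder sequences for illustration only.
TRANSIT = "ATGGCTTCTTCA"    # four codons, no stop
ENZYME = "ATGACTGATGTTTAA"  # four codons followed by a stop codon

fusion = fuse_in_frame(TRANSIT, ENZYME)
print(fusion, len(fusion))
```

The sketch covers only the reading-frame requirement; it says nothing about whether a given fusion is actually imported or processed in the plastid.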

Of particular interest are transit peptide
sequences derived from enzymes known to be imported
into the leucoplasts of seeds. Examples of enzymes
containing useful transit peptides include those
related to lipid biosynthesis (e.g., subunits of the
plastid-targeted dicot acetyl-CoA carboxylase, biotin
carboxylase, biotin carboxyl carrier protein,
α-carboxytransferase, plastid-targeted monocot
multifunctional acetyl-CoA carboxylase (Mr, 220,000));
plastidic subunits of the fatty acid synthase complex
(e.g., acyl carrier protein (ACP), malonyl-ACP
synthase, KASI, KASII, KASIII, etc.); stearoyl-ACP
desaturase; thioesterases (specific for short, medium,
and long chain acyl ACP); plastid-targeted acyl
transferases (e.g., glycerol-3-phosphate:acyl
transferase); enzymes involved in the biosynthesis of
aspartate family amino acids; phytoene synthase;
gibberellic acid biosynthesis (e.g., ent-kaurene
synthases 1 and 2); sterol biosynthesis (e.g., hydroxy
methyl glutaryl-CoA reductase); and carotenoid
biosynthesis (e.g., lycopene synthase).

Exact translational fusions to the transit peptide
of interest may not be optimal for protein import into
the plastid. By creating translational fusions of any
of the enzymes discussed herein to the precursor form
of a naturally imported protein or C-terminal deletions
thereof, one would expect that such translational
fusions would aid in the uptake of the engineered
precursor protein into the plastid. For example,
Nawrath et al. (1994) used a similar approach to
create the vectors employed to introduce the PHB
biosynthesis genes of A. eutrophus into Arabidopsis.

It is therefore fully expected that targeting of
the enzymes discussed herein to leaf chloroplasts or
seed plastids such as leucoplasts by fusing transit
peptide gene sequences thereto will further enhance in
vivo conditions for the biosynthesis of PHAs,
especially P(3HB-co-3HV) copolymer, in plants.

Plastid Transformation for Expression of Enzymes 
Involved in PHA Biosynthesis 

Alternatively, enzymes facilitating the
biosynthesis of metabolites such as threonine,
2-oxobutyrate, propionyl-CoA, β-ketovaleryl-CoA,
β-hydroxyvaleryl-CoA, and PHAs discussed herein can be
expressed in situ in plastids by direct transformation
of these organelles with appropriate recombinant
expression constructs. Constructs and methods for
stably transforming plastids of higher plants are well
known in the art (Svab et al., 1990; Svab et al., 1993;
Staub et al., 1993; Maliga et al., U.S. Patent No.
5,451,513; PCT International Publications WO 95/16783,
WO 95/24492, and WO 95/24493). These methods generally
rely on particle gun delivery of DNA containing a
selectable marker in addition to introduced DNA
sequences for expression, and targeting of the DNA to
the plastid genome through homologous recombination.
Transformation of a wide variety of different monocots
and dicots by particle gun bombardment is routine in
the art (Hinchee et al., 1994; Walden and Wingender,
1995).

DNA constructs for plastid transformation
generally comprise a targeting segment comprising
flanking DNA sequences substantially homologous to a
predetermined sequence of a plastid genome, which
targeting segment enables insertion of DNA coding
sequences of interest into the plastid genome by
homologous recombination with said predetermined
sequence; a selectable marker sequence, such as a
sequence encoding a form of plastid 16S ribosomal RNA
that is resistant to spectinomycin or streptomycin, or
that encodes a protein which inactivates spectinomycin
or streptomycin (such as the aadA gene), disposed
within said targeting segment, wherein said selectable
marker sequence confers a selectable phenotype upon
plant cells, substantially all the plastids of which
have been transformed with said DNA construct; and one
or more DNA coding sequences of interest disposed
within said targeting segment relative to said
selectable marker sequence so as not to interfere with
conferring of said selectable phenotype. In addition,
plastid expression constructs also generally include a
plastid promoter region and a transcription termination
region capable of terminating transcription in a plant
plastid, wherein said regions are operatively linked to
the DNA coding sequences of interest.
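By way of illustration only, the construct layout described above can be written down as a small data model. The sketch below is not part of the disclosed methods. The class names, field names, and placeholder strings are hypothetical, and the check encodes only the elements named in the preceding paragraph (two homologous flanks, a selectable marker, and one or more coding sequences, each with a plastid promoter and terminator).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ExpressionCassette:
    """One coding sequence with its plastid regulatory regions (hypothetical names)."""
    promoter: str          # plastid promoter region
    coding_sequence: str   # DNA coding sequence of interest
    terminator: str        # plastid transcription termination region


@dataclass
class PlastidConstruct:
    """Illustrative model of a plastid transformation construct."""
    left_flank: str         # flanking DNA homologous to the plastid genome
    right_flank: str        # second homologous flank of the targeting segment
    selectable_marker: str  # e.g. "aadA" or a resistant 16S rRNA allele
    cassettes: List[ExpressionCassette] = field(default_factory=list)

    def problems(self) -> List[str]:
        """List the ways this construct departs from the described layout."""
        issues = []
        if not (self.left_flank and self.right_flank):
            issues.append("targeting segment needs two homologous flanks")
        if not self.selectable_marker:
            issues.append("selectable marker missing")
        if not self.cassettes:
            issues.append("no coding sequence of interest")
        for i, cassette in enumerate(self.cassettes):
            if not (cassette.promoter and cassette.terminator):
                issues.append(f"cassette {i} lacks promoter or terminator")
        return issues


# Placeholder element names; none of these strings come from the specification.
construct = PlastidConstruct(
    left_flank="flank_A",
    right_flank="flank_B",
    selectable_marker="aadA",
    cassettes=[ExpressionCassette("plastid_promoter", "gene_of_interest",
                                  "plastid_terminator")],
)
print(construct.problems())
```

Several cassettes can be appended to the same construct, which loosely mirrors the placement of more than one coding sequence within a single targeting segment.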

A further refinement in chloroplast
transformation/expression technology that facilitates
control over the timing and tissue pattern of
expression of introduced DNA coding sequences in plant
plastid genomes has been described in PCT International
Publication WO 95/16783. This method involves the
introduction into plant cells of constructs for nuclear
transformation that provide for the expression of a
viral single subunit RNA polymerase and targeting of
this polymerase into the plastids via fusion to a
plastid transit peptide. Transformation of plastids
with DNA constructs comprising a viral single subunit
RNA polymerase-specific promoter specific to the RNA
polymerase expressed from the nuclear expression
constructs operably linked to DNA coding sequences of
interest permits control of the plastid expression
constructs in a tissue and/or developmental specific
manner in plants comprising both the nuclear polymerase
construct and the plastid expression constructs.
Expression of the nuclear RNA polymerase coding
sequence can be placed under the control of either a
constitutive promoter, or a tissue- or developmental
stage-specific promoter, thereby extending this control
to the plastid expression construct responsive to the
plastid-targeted, nuclear-encoded viral RNA polymerase.
The introduced DNA coding sequence can be a single
encoding region, or may contain a number of consecutive
encoding sequences to be expressed as an engineered or
synthetic operon. The latter is especially attractive
where, as in the present invention, it is desired to
introduce multigene biochemical pathways into plastids.
This approach is not practical using standard nuclear
transformation techniques since each gene introduced
therein must be engineered as a monocistron, including
an encoded transit peptide and appropriate promoter and
terminator signals. Individual gene expression levels
may vary widely among different cistrons, thereby
possibly adversely affecting the overall biosynthetic
process. This can be avoided by the chloroplast
transformation approach.

Production of Transgenic Plants Comprising Genes for
PHA Biosynthesis

Plant transformation vectors capable of delivering
DNAs (genomic DNAs, plasmid DNAs, cDNAs, or synthetic
DNAs) encoding PHA biosynthetic enzymes and other
enzymes for optimizing substrate pools for PHA
biosynthesis as discussed in Examples 1-7 herein can be
easily designed. Various strategies can be employed to
introduce these encoding DNAs to produce transgenic
plants capable of biosynthesizing high levels of PHAs,
including:

1. Transforming individual plants with an
encoding DNA of interest. Two or more transgenic
plants, each containing one of these DNAs, can then be
grown and cross-pollinated so as to produce hybrid
plants containing the two DNAs. The hybrid can then be
crossed with the remaining transgenic plants in order
to obtain a hybrid plant containing all DNAs of
interest within its genome.

2. Sequentially transforming plants with plasmids
containing each of the encoding DNAs of interest,
respectively.

3. Simultaneously cotransforming plants with
plasmids containing each of the encoding DNAs,
respectively.

4. Transforming plants with a single plasmid
containing two or more encoding DNAs of interest.

5. Transforming plants by a combination of any of
the foregoing techniques in order to obtain a plant
that expresses a desired combination of encoding DNAs
of interest.

Traditional breeding of transformed plants
produced according to any one of the foregoing methods
by successive rounds of crossing can then be carried
out to incorporate all the desired encoding DNAs in a
single homozygous plant line (Nawrath et al., 1994; PCT
International Publication WO 93/02187). Similar
strategies can be employed to produce bacterial host
cells engineered for optimal PHA production.

In methods 2 and 3, the use of vectors containing
different selectable marker genes to facilitate
selection of plants containing two or more different
encoding DNAs is advantageous. Examples of useful
selectable marker genes include those conferring
resistance to kanamycin, hygromycin, sulphonamides,
glyphosate, bialaphos, and phosphinothricin.

Stability of Transgene Expression

As several overexpressed enzymes may be required
to produce optimal levels of substrates for copolymer
formation, the phenomenon of co-suppression may
influence transgene expression in transformed plants.
Several strategies can be employed to avoid this
potential problem (Finnegan and McElroy, 1994).



One commonly employed approach is to select and/or
screen for transgenic plants that contain a single
intact copy of the transgene or other encoding DNA
(Assaad et al., 1993; Vaucheret, 1993; McElroy and
Brettell, 1994). Agrobacterium-mediated transformation
technologies are preferred in this regard.

Inclusion of nuclear scaffold or matrix attachment
regions (MAR) flanking a transgene has been shown to
increase the level and reduce the variability
associated with transgene expression in plants (Stief
et al., 1989; Breyne et al., 1992; Allen et al., 1993;
Mlynarova et al., 1994; Spiker and Thompson, 1996).
Flanking a transgene or other encoding DNA with MAR
elements may overcome problems associated with
differential base composition between such transgenes
or encoding DNAs and integration sites, and/or the
detrimental effects of sequences adjacent to transgene
integration sites.

The use of enhancers from tissue-specific or
developmentally-regulated genes may ensure that
expression of a linked transgene or other encoding DNA
occurs in the appropriately regulated manner.

The use of different combinations of promoters,
plastid targeting sequences, and selectable markers for
introduced transgenes or other encoding DNAs can avoid
potential problems due to trans-inactivation in cases
where pyramiding of different transgenes within a
single plant is desired.

Finally, inactivation by co-suppression can be
avoided by screening a number of independent transgenic
plants to identify those that consistently overexpress
particular introduced encoding DNAs (Register et al.,
1994). Site-specific recombination in which the
endogenous copy of a gene is replaced by the same gene,
but with altered expression characteristics, should
obviate this problem (Yoder and Goldsbrough, 1994).

Any of the foregoing methods, alone or in
combination, can be employed in order to ensure the
stability of transgene expression in transgenic plants
of the present invention.

Cloning of plastid pyruvate dehydrogenase complex and 
branched chain oxoacid dehydrogenase complex subunits 
and components 

The present invention provides nucleotide
sequences that encode the E1α and E1β subunits, and the
E2 component, of the plastid pyruvate dehydrogenase
complex, as well as the E1α and E1β subunits, and the
E2 component, of the branched chain oxoacid
dehydrogenase complex, of Arabidopsis thaliana. These
sequences can be cloned by any appropriate method known
in the art. For example, cDNA clones of known
components of similar enzymes from other species can be
utilized to screen a cDNA library from which the cDNA
for the enzyme component is desired. Sources from
which the plastid PDC E1α and E1β cDNAs can be obtained
include the analogous enzyme-encoding cDNAs from the
red alga Porphyra purpurea; for the E2 component of the
plastid pyruvate dehydrogenase, the analogous enzyme
gene from the cyanobacterium Synechocystis sp. can be
used. The cDNA for the E1α of a BCOADC can be isolated
by identifying cDNAs which have significant homology to
analogous tomato, human and bovine BCOADC E1α
sequences. Similarly, the E1β and the E2 components of
a BCOADC can be isolated by comparing the similarity of
candidate sequences with the human and bovine BCOADC
E1β and E2 components, respectively. A cDNA library
for the isolation of these components can be an
expressed sequence tag library, for example one
comprising cDNA from Arabidopsis thaliana.

The cloned cDNAs for the plastid PDC and the
BCOADC components can be sequenced in order to
determine the nucleotide sequence and deduce the amino
acid sequence for these enzymes. The sequences of
these cDNAs can be determined by any method known in
the art. Methods for the determination of various
portions of the sequenced cDNA, such as a plastid
targeting sequence, are also well known in the art.

Engineering plants to produce propionyl-CoA in plastids 

The production of the P(3HB-co-3HV) precursor
propionyl-CoA in plastids requires the presence of two
elements which are not present, or which are present at
very low levels, in the plastids of wild-type plants:
2-oxobutyrate, and enzymes which will convert
2-oxobutyrate into propionyl-CoA.

As noted above, Gruys et al. (1998) discuss
several methods for the production of 2-oxobutyrate in
plastids. These include:

--Overexpression of threonine deaminase;
--Overexpression of aspartate kinase and threonine
deaminase; and
--Overexpression of aspartate kinase, homoserine
dehydrogenase, and threonine deaminase.

The overexpression of these enzymes can be
accomplished through the transformation into plants of
nucleotide sequences encoding these enzymes, operably
linked to a plant promoter, such as the cauliflower
mosaic virus (CaMV) 35S promoter, or any other promoter
known in the art which causes overexpression of such
enzymes in plants.

43 UMO 1482.1

PATENT

The expression of these and other enzymes in
plastids can be achieved in at least two ways:

1. By transforming coding sequences for these
enzymes directly into the plastid genome in such a way
that they are incorporated into the plastid genome.
Constructs and methods for stably transforming plastids
of higher plants are well known in the art (for
example, Svab et al., 1990; Svab et al., 1993; Staub et
al., 1993; Maliga et al., U.S. Patent No. 5,451,513;
PCT International Publications WO 95/16783, WO
95/24492, and WO 95/24493). These methods generally
rely on particle gun delivery of DNA containing a
selectable marker in addition to introduced DNA
sequences for expression, and targeting of the DNA to
the plastid genome through homologous recombination.

2. By creating a plant transformation vector
comprising a coding sequence for the enzyme operably
linked to a plastid targeting sequence, then
transforming this vector into the plant. All of the
enzymes discussed herein can be modified for plastid
targeting by employing plant cell nuclear
transformation constructs wherein DNA coding sequences
of interest are fused to any of the available targeting
peptide sequences capable of facilitating transport of
the encoded enzymes into plant plastids, and driving
expression by employing an appropriate promoter.
Examples of plastid targeting peptides are provided in
Table 1 and in von Heijne et al. (1991). The sequences
that encode a targeting peptide region can be obtained,
for example, from plant nuclear-encoded plastid
proteins, such as the small subunit (SSU) of ribulose
bisphosphate carboxylase, plant fatty acid
biosynthesis related genes including acyl carrier
protein (ACP), stearoyl-ACP desaturase, β-ketoacyl-ACP
synthase and acyl-ACP thioesterase, or LHCPII genes.
The encoding sequence for a targeting peptide effective
in transport to plastids can include all or a portion
of the encoding sequence for a particular targeting
peptide, and can also contain portions of the mature
protein encoding sequence associated with a particular
targeting peptide. Numerous examples of targeting
peptides that can be used to deliver target proteins
into plastids exist, and the particular targeting
peptide encoding sequences useful in the present
invention are not critical as long as delivery into a
plastid is obtained. Proteolytic processing within the
plastid then produces the mature enzyme. This
technique has proven successful not only with enzymes
involved in PHA synthesis (Nawrath et al., 1994), but
also with neomycin phosphotransferase II (NPT-II) and
CP4 EPSPS (Padgette et al., 1995), for example.

Table 1. Examples of plastid proteins from various
species with known plastid targeting sequences
that can be used to target proteins to plastids

Chloroplast Targeting Peptides

Arabidopsis thaliana:
    5-enolpyruvyl-shikimate-3-phosphate synthase
    Rubisco activase
    Rubisco small subunit
    Tryptophan synthase

Brassica napus:
    Acyl carrier protein
    Plastid chaperonin-60

Pisum sativum:
    Carbonic anhydrase
    Chloroplast stromal HSP70
    Glutamine synthetase
    Rubisco small subunit

Reference: von Heijne, G.; Hirai, T.; Klosgen, R.B.;
Steppuhn, J.; Bruce, B.; Keegstra, K.; Herrmann, R.
(1991) CHLPEP-A database of chloroplast transit peptides.
Plant Molecular Biology Reporter 9:104-126.




Engineering plants to produce
poly(3-hydroxybutyrate-3-hydroxyvalerate) copolymer

Plants which produce P(3HB-co-3HV) can be created
by engineering them to produce 2-oxobutyrate, to
convert 2-oxobutyrate to propionyl-CoA, and to synthesize
P(3HB-co-3HV) from propionyl-CoA and acetyl-CoA.
Methods for producing plants which synthesize
2-oxobutyrate are discussed above. Such plants can be
modified to convert 2-oxobutyrate to propionyl-CoA in
the manner discussed below.

The nucleotide sequences of the BCOADC E1α and E1β
subunits, and that of the E2 component, are provided
herein as a means to effect the conversion of
2-oxobutyrate to propionyl-CoA in plastids containing
the 2-oxobutyrate substrate. It is not necessary to
provide the E3 component since the E3 components of all
of the α-ketoacid dehydrogenase complexes are probably
interchangeable. The E3 subunit already present in the
plastid PDC thus almost certainly functions with
plastid-targeted BCOADC subunits. The nucleotide
sequences of the plastid PDC E1α and E1β subunits, and
the E2 component, provide sources of plastid targeting
sequences. These plastid PDC sequences can also be
genetically manipulated to enhance their ability to
convert 2-oxobutyrate to propionyl-CoA, as suggested by
Gruys et al. (1998).

The nucleotide sequences encoding the BCOADC E1α
and E1β subunits, and the E2 component, can be directly
transformed into the plastid genome by the methods
discussed above. Alternatively, the BCOADC E1 and E2
nucleotide sequences can be transformed into the plant
nuclear genome, wherein the enzyme coding sequences are
operably linked to a plastid targeting sequence by
methods known in the art. See Example 7. Useful
plastid targeting sequences include those from the
plastid PDC. These targeting sequences from
Arabidopsis thaliana are disclosed in Examples 1 and 2,
below.

As another alternative for utilizing a BCOADC for
the conversion of 2-oxobutyrate to propionyl-CoA in
plastids, a nucleotide sequence encoding the BCOADC E1β
subunit can be engineered to utilize the PDC E2
component which is already present in the plastids.
The BCOADC E1β subunit can be modified such that the
native E2 binding region thereof is replaced with the
E2 binding region of the plastid PDC E1β subunit. The
nucleotide sequences encoding the modified BCOADC E1β
subunit and the BCOADC E1α subunit can also be operably
linked to a plastid targeting sequence. The modified
nucleotide sequences for these two subunits (α and β)
of the BCOADC E1 component can then be inserted into
plants by standard plant transformation methods, where
they are translated in the cytoplasm. The enzymes are
then transported to the plastid where they combine with
the plastid PDC E2 and E3 components, and catalyze the
conversion of 2-oxobutyrate to propionyl-CoA. See
Example 6 below.

The conversion of propionyl-CoA and acetyl-CoA to
P(3HB-co-3HV) requires a β-ketothiolase, a β-ketoacyl-CoA
reductase, and a PHA synthase. Nucleotide
sequences encoding these enzymes can be incorporated
into the plastid genome directly, or into the nuclear
genome, with operably linked plastid targeting sequences,
utilizing the same well-known methods as previously
discussed. Preferred β-ketothiolases are BktB and pAE65
from A. eutrophus, and Zoogloea ramigera β-ketothiolases "A"
and "B", as disclosed in Gruys et al. (1998). Preferred
β-ketoacyl-CoA reductases and PHA synthases include those from
A. eutrophus, encoded by the phbB and phbC genes,
respectively. However, the use of other β-ketothiolases
which are able to utilize propionyl-CoA, and the use of
other β-ketoacyl-CoA reductases and PHA synthases are within
the scope of this invention. Included are those enzymes
from, for example, Alcaligenes faecalis, Aphanothece sp.,
Azotobacter vinelandii, Bacillus cereus, Bacillus
megaterium, Beijerinckia indica, Derxia gummosa,
Methylobacterium sp., Microcoleus sp., Nocardia corallina,
Pseudomonas cepacia, Pseudomonas extorquens, Pseudomonas
oleovorans, Rhodobacter sphaeroides, Rhodobacter capsulatus,
Rhodospirillum rubrum, and Thiocapsa pfennigii.

P(3HB-co-3HV) Copolymer Composition

The P(3HB-co-3HV) copolymers of the present
invention can comprise about 75-99% 3HB and about 1-25%
3HV based on the total weight of the polymer. More
preferably, P(3HB-co-3HV) copolymers of the present
invention comprise about 85-99% 3HB and about 1-15%
3HV. Even more preferably, such copolymers comprise
about 90-99% 3HB and about 1-10% 3HV. P(3HB-co-3HV)
copolymers comprising about 4%, about 8%, and about 12%
3HV on a weight basis possess properties that have made
them commercially attractive for particular
applications. One skilled in the art can modify
P(3HB-co-3HV) copolymers of the present invention by
physical or chemical means to produce copolymer
derivatives having desirable properties different from
those of the plant-produced copolymer.
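As an illustration of how the nested composition ranges above relate to one another, the following sketch assigns a given 3HV weight percent to the narrowest stated range that contains it. It assumes a two-component polymer (so that 3HB is 100 minus 3HV) and treats the "about" limits as exact cut-offs. The function name and the tier labels are hypothetical and are not terms used in this specification.

```python
def composition_tier(pct_3hv: float) -> str:
    """Return the narrowest stated range containing a 3HV weight percent.

    Assumes 3HB percent = 100 - 3HV percent and treats the "about"
    limits in the text as hard boundaries, which is a simplification.
    """
    if not 0 <= pct_3hv <= 100:
        raise ValueError("weight percent must lie between 0 and 100")
    if pct_3hv < 1 or pct_3hv > 25:
        return "outside the stated ranges"
    if pct_3hv <= 10:
        return "even more preferred (about 90-99% 3HB, 1-10% 3HV)"
    if pct_3hv <= 15:
        return "more preferred (about 85-99% 3HB, 1-15% 3HV)"
    return "within the broadest range (about 75-99% 3HB, 1-25% 3HV)"


# The 4%, 8%, and 12% 3HV compositions mentioned above, plus one outlier.
for pct in (4, 8, 12, 30):
    print(pct, "->", composition_tier(pct))
```

Under these assumptions the 4% and 8% compositions fall in the narrowest range, while the 12% composition falls in the intermediate one.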

Optimization of P(3HB-co-3HV) copolymer production
by the methods discussed herein is expected to result
in yields of copolymer in the range of from at least
about 1% to at least about 20% of the fresh weight of
the plant tissue, organ, or structure in which it is
produced.

The following examples illustrate the invention,
but are not to be taken as limiting the various aspects
of the invention so illustrated.

Conventional methods of gene isolation, molecular
cloning, vector construction, etc., are well known in
the art and are summarized in Sambrook et al., 1989,
and Ausubel et al., 1989 and 1994. One skilled in the
art can readily repeat the methods and reproduce the
compositions described herein without undue
experimentation. The various DNA sequences, fragments,
etc., necessary for this purpose can be readily
obtained as components of commercially available
plasmids, or synthesized by well known methods, or are
otherwise well known in the art and publicly available.

Example 1

Cloning and Sequencing cDNA Encoding
the E1α and E1β Subunits of the Arabidopsis thaliana
Plastid Pyruvate Dehydrogenase Complex

Expressed sequence tag (EST) clones (Reith et al.,
1995) from the Arabidopsis Biological Resource Center
(ABRC) at Ohio State University were used to isolate
full-length cDNAs for both the plastid E1α and E1β
subunits from an A. thaliana cDNA library. Two clones
(GenBank accessions T75600 and N65566) were identified
as potentially encoding the plastid E1α and E1β
subunits as follows.

Oligonucleotides were designed based on sequences
common to P. purpurea odpA and odpB and the two
Arabidopsis EST sequences and synthesized (all recited
in the 5'-3' direction):

E1α: 5' primer, CGGTACtCAAGTCTGACTCTGTCGTT (SEQ ID NO:7);
     3' primer, CCTTCGAuAGGTTCCATCTCCGAAAAA (SEQ ID NO:8);
E1β: 5' primer, CGGTACtCTTCGAGGCTCTTCAGGAA (SEQ ID NO:9);
     3' primer, CCTTCGAuACGGGCCTTAGACCAGT (SEQ ID NO:10).

The symbols denote restriction sites (t: Kpn I, and u:
Hind III) added for subcloning. Thermal cycling was
used to amplify cDNA fragments from A. thaliana using
first strand cDNA. Thermal cycling reactions (50 µl
total volume) contained 10 mM Tris-HCl, pH 7.9, 1.25 mM
MgCl2, 25 µM dNTPs, 5 units Taq polymerase (Promega,
Madison, WI), 2 µg A. thaliana first strand cDNA, and
10 ng of each primer. Thermal cycling was performed
with a Perkin-Elmer model 480, with rapid ramp times
set at 1°C/s. Cycling conditions were 94°C for 20 s,
50°C for 30 s, 72°C for 2 min with 6 s extensions each
cycle and 30 rounds of cycling. Under these
conditions, products containing 288 base pairs (E1α)
and 215 base pairs (E1β) were obtained. The products
were subcloned into pGEMT (Promega, Madison, WI) and
sequenced to confirm their identity. Thermal cycling
was also used to generate probes radiolabelled with
(α-32P)-dCTP, using reaction mixtures identical to those
previously described except for a 1000-fold reduction
in the concentration of non-radioactive dCTP. Before
use, the probes were desalted using Sephadex G-50
columns to remove unincorporated nucleotides. An
Arabidopsis cDNA library (λ-PRL2, obtained from the
ABRC) was plated at a density of 2.25 x 10^4 plaques per
plate for a total of 2.25 x 10^5 plaques. BioTrace NT nylon
filters (Gelman, Ann Arbor, MI) were used for
plaque-lifts and were processed according to the
manufacturer's specifications. Hybridizations were
performed according to Current Protocols in Molecular
Biology (Ausubel et al., 1994). After three rounds of
screening, 7 potential E1α and 12 potential E1β cDNA
clones were isolated, ranging in size from 1100 to 1550
base pairs. Plaque-purified λ phage were treated
according to the manufacturer's instructions (Gibco
BRL, Gaithersburg, MD) in order to excise the pZL-1
recombinant clones.
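For orientation, the cycling program and the plating scheme described above can be checked with a few lines of arithmetic. The sketch below is illustrative only and rests on two assumptions: that "6 s extensions each cycle" means the 72°C step lengthens by 6 s after every cycle, and that the library total is 2.25 x 10^5 plaques. Ramp times between temperatures are ignored. The function and parameter names are hypothetical.

```python
def cycling_hold_time_s(cycles=30, denature_s=20, anneal_s=30,
                        extend_s=120, extend_increment_s=6):
    """Sum of programmed hold times in seconds, excluding ramp time.

    Assumption: cycle k (counting from 0) extends for
    extend_s + k * extend_increment_s seconds.
    """
    total = 0
    for k in range(cycles):
        total += denature_s + anneal_s + extend_s + k * extend_increment_s
    return total


total_s = cycling_hold_time_s()
print(total_s, "s of hold time, about", round(total_s / 60, 1), "min")

# Library screen: number of plates implied by the plaque figures
# (the 10^5 total is an assumed reading of the text).
plaques_per_plate = 2.25e4
total_plaques = 2.25e5
print(total_plaques / plaques_per_plate, "plates")
```

With the stated parameters the hold times sum to 7710 s, and the plaque figures correspond to ten plates.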

DNA sequencing was performed using an ABI Prism
Model 377 sequencer, and analyzed using IntelliGenetics
GeneWorks DNA analysis program version 2.5 on a
Macintosh computer. Dye-deoxy terminating cycle
sequencing reactions were carried out on both strands
of full-length cDNA inserts and deletion fragments
derived therefrom.

DNA isolation and Northern and Southern blotting
were carried out according to Current Protocols in
Molecular Biology (Sections 2.9.1, 4.3.1 and 4.9.1;
Ausubel et al., 1994). RNA isolation was accomplished
with the RNAgents total RNA isolation kit (Promega,
Madison, WI). Northern blot prehybridization (3 h),
hybridization (12 h), and 4 washes were done with 2.5X
SSPE (1X = 0.15 mM NaCl, 0.02 mM Na2PO4, 2 µM EDTA, pH
7.4), 1% SDS, 1% non-fat dry milk, and 250 µg/ml salmon
sperm DNA at 68°C. Blots were exposed on Kodak
X-OMAT/AR film (Rochester, New York) at -70°C with an
intensifying screen.

Among the genes present in the P. purpurea
plastome are two open reading frames, odpA and odpB,
encoding proteins 32% identical to the Arabidopsis
mitochondrial E1α and E1β subunits (Grof et al., 1995;
Luethy et al., 1994; Luethy et al., 1995). Attempts to
use cloned mitochondrial PDC cDNAs as probes to
identify plastid sequences have been unsuccessful.
Based upon the odpA and odpB sequences, two EST clones
(accessions T75600 and N65566) which appear to encode
proteins more highly related to the P. purpurea odpA
and odpB sequences than to the Arabidopsis
mitochondrial sequences were used to isolate two cDNAs
as potential E1α and E1β clones.
The nucleotide sequence of the Arabidopsis plastid
PDC E1α cDNA (GenBank Accession No. U80185) is shown in
Appendix A and as SEQ ID NO:1. The E1α cDNA (1530 bp) has
a 106 bp 5' untranslated region, a 1284 bp open reading
frame encoding a polypeptide of 428 amino acids
(Appendix B and SEQ ID NO:2), and a 140 bp 3'
untranslated region. The nucleotide sequence of the
Arabidopsis plastid PDH E1β cDNA (GenBank Accession No.
U80186) is shown in Appendix C and as SEQ ID NO:3. The
E1β cDNA (1441 bp) has a 6 bp 5' untranslated region, a
1218 bp open reading frame encoding a polypeptide of
406 amino acids (Appendix D and SEQ ID NO:4), and a 217
bp 3' untranslated region. The calculated molecular
weight and isoelectric point values for the E1α and E1β
polypeptides encoded by the open reading frames are
47,120 with a pI of 7.25, and 44,208 with a pI of 5.89,
respectively. The deduced amino acid sequence for E1α
has 61%, and E1β 68%, identity with P. purpurea odpA
and odpB, respectively.
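The segment lengths quoted above can be cross-checked against the stated totals. The short sketch below is offered only as a consistency check on the figures as given in the text. The dictionary layout and the function name are hypothetical, and whether the quoted open reading frame lengths include a stop codon is not stated and is not assumed here.

```python
# Lengths in base pairs and residue counts as quoted in the text.
cdnas = {
    "E1alpha": {"total": 1530, "utr5": 106, "orf": 1284, "utr3": 140,
                "residues": 428},
    "E1beta": {"total": 1441, "utr5": 6, "orf": 1218, "utr3": 217,
               "residues": 406},
}


def check(entry):
    """Return (segments sum to the total, ORF length equals 3 x residues)."""
    parts_ok = entry["utr5"] + entry["orf"] + entry["utr3"] == entry["total"]
    codons_ok = entry["orf"] == 3 * entry["residues"]
    return parts_ok, codons_ok


for name, entry in cdnas.items():
    print(name, check(entry))
```

Both cDNAs pass both checks: the untranslated regions and open reading frame sum to the stated cDNA length, and each open reading frame length is three times the stated number of residues.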

The first 68 residues of E1α and the first 73
residues of E1β exhibit characteristics of chloroplast
targeting peptides but not those of mitochondrial
targeting sequences (Gavel et al., 1990; von Heijne et
al., 1989). To determine structural motifs of the
targeting peptides, we used the GeneWorks
(IntelliGenetics, Mountain View, CA) protein algorithm
to identify possible α-helices and β-strands. Both
plastid E1α and E1β have the potential to form
amphiphilic β-strands consistent with plastid targeting
sequences, but did not fit the amphiphilic α-helix
which is characteristic of mitochondrial targeting
sequences.
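The GeneWorks algorithm itself is not described here. As a stand-in, the sketch below uses a different and commonly cited measure of amphiphilicity, the hydrophobic moment, computed with the Kyte-Doolittle hydropathy scale. It is an assumption-laden illustration rather than the method used in this Example: the turn angles (100 degrees per residue for an α-helix, 180 degrees for an idealized β-strand), the function name, and the toy peptide are all choices made for this sketch.

```python
import math

# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobic_moment(seq: str, angle_deg: float) -> float:
    """Mean hydrophobic moment of seq for a given per-residue turn angle."""
    angle = math.radians(angle_deg)
    sin_sum = sum(KD[res] * math.sin(i * angle) for i, res in enumerate(seq))
    cos_sum = sum(KD[res] * math.cos(i * angle) for i, res in enumerate(seq))
    return math.hypot(sin_sum, cos_sum) / len(seq)


# Toy peptide alternating hydrophobic and polar residues; it is not
# taken from any sequence in this specification.
toy = "VSVSVSVSVS"
helix = hydrophobic_moment(toy, 100)   # alpha-helix periodicity
strand = hydrophobic_moment(toy, 180)  # idealized beta-strand periodicity
print("helix moment:", round(helix, 2), "strand moment:", round(strand, 2))
```

For a peptide whose hydrophobic residues alternate with polar ones, the moment at the strand periodicity exceeds that at the helix periodicity, which is the kind of contrast drawn above between plastid and mitochondrial targeting sequences.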

Tables 2 and 3 show the alignment of the deduced
amino acid sequences of PDH E1α and E1β. Abbreviations
are the same as in Fig. 7. * indicates conserved,
• non-conserved phosphorylation sites. ° indicates the
conserved Cys 62 of the mature H.s. E1α sequence.

Overall, there is 28% sequence identity between
Arabidopsis plastid PDC E1α and its mammalian
counterparts. However, in specific regions, the degree
of sequence conservation is much higher. The PDH
component of PDC requires thiamine pyrophosphate (TPP)
as a cofactor for decarboxylation of pyruvate (Patel et
al., 1990). It has been reported that TPP binds to the
E1α subunit of mammalian PDH at a site containing a
structural motif common to pyrophosphate-binding
enzymes (Reed, 1974). A similar motif (50% identity
with the bovine E1α TPP-binding domain) is found in the
A. thaliana plastid E1α sequence at residues 160-213
(Table 2).
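Percent identity figures such as those quoted above depend on how aligned positions are counted, and the convention used in this Example is not stated. The sketch below shows one common convention: identical residues divided by the number of alignment columns in which neither sequence has a gap. The function name, the gap symbol, and the toy fragments are hypothetical and are not drawn from Tables 2 or 3.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of ungapped alignment columns with identical residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned fragments invented for illustration only.
print(round(percent_identity("MKT-AYLG", "MRTQAY-G"), 1))
```

Other conventions, for example dividing by the length of the shorter sequence or by the full alignment length, give different values for the same alignment, so figures from different sources are comparable only when the convention is the same.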

A highly conserved Cys residue (Cys 62 of mature
human E1α, Table 2) has been identified in eukaryotic
PDH E1α sequences, and it has been proposed that this
Cys is an essential component of the enzyme's active
site (Ali et al., 1993). The A. thaliana plastid E1α
sequence contains a similar motif, i.e. the same
immediate flanking residues at 112-116, but the
otherwise conserved Cys is replaced with a Val (Table
2).

Mitochondrial PDCs are regulated in part by
reversible phosphorylation of three conserved Ser
residues in the E1α sequence by a specific,
complex-associated PDH-kinase (Reed, 1974). The Ser residues
phosphorylated in mammalian mitochondrial PDH are also
conserved in the plant mitochondrial (Luethy et al.,
1995), yeast (Behal et al., 1989), and nematode
(Johnson et al., 1992) amino acid sequences. However,
while the plant mitochondrial PDC is reversibly
phosphorylated (Randall et al., 1989; Randall et al.,
1996), all evidence to date indicates that plastid PDC
activity is not regulated by phosphorylation (Camp et
al., 1985). Despite this difference, the regulatory
Ser residues and their flanking sequences are present
in the plastid E1α sequence (Table 2). Korotchkina and
Patel (1995) have reported the results from mutagenesis
of these phosphorylation sites, and concluded that site
one is closer to the active site or lies on the pathway
to the main catalytic conformational change. This
might explain why this region is so highly conserved.
The amino acid motif corresponding to phosphorylation
site one in mitochondrial PDH sequences is present in
the plastid polypeptide (Tyr 320-Pro 330 or Tyr 287-Pro
297 in the H.s. sequence, Table 2). Two of the four
substitutions are by residues with conserved
properties. The sequence of the plastid E1α
corresponding to phosphorylation site two lacks a Ser
and the region is dominated by five acidic and two
basic residues (Asp 329-Asp 339). The Arabidopsis
plastid E1α sequence contains a Ser at site 3 (Ala
259-Ala 267), but the flanking residues are dissimilar to
the mammalian site 3 (Table 2). While two of the three
Ser are in the appropriate positions, it is most likely
then that plastid PDC is not regulated by
phosphorylation due to the lack of plastid PDH-kinase
(Camp et al., 1985).

Wexler et al. (1991) compared alignments of three
PDH and three branched-chain α-keto acid dehydrogenase
sequences. Among E1β sequences, four regions of
sequence conservation were observed. Region one, the
proposed E2 interaction site, is present in the
Arabidopsis plastid PDH E1β sequence (Table 3).
Conserved regions two and three share high homology
with other decarboxylating enzymes, suggesting a role
in decarboxylation of pyruvate (Wexler et al., 1991).
A functional role has not yet been attributed to region
four (Table 3). Eswaran et al. (1995) have described
Arg 239 as being an essential residue near or at the
active site of the bovine E1β. This residue is
conserved throughout the eukaryotic PDH sequences
(e.g., Arg 269 of H.s. sequence in Table 3), and is
present in the A. thaliana plastid E1β sequence at
position 318.

The genomic organization of Arabidopsis E1α and
E1β was determined by Southern blot analysis. An E1α
cDNA probe hybridized to a single restriction fragment
in each lane, suggesting one gene (Fig. 4A). An E1β
cDNA probe, on the other hand, hybridized to multiple
fragments in a pattern consistent with the restriction
digest of E1β cDNA (data not shown). The Xba I lane
contained multiple hybridizing bands which could be due
to a second gene or an intron containing an Xba I
restriction site (Fig. 4B).

In order to evaluate expression of the A. thaliana plastid PDH
genes, 10 μg total RNA obtained from young leaves was resolved
by formaldehyde gel electrophoresis. Northern blot analyses
confirmed the expression of a single mRNA species of 1.65 kb
for E1α and 1.5 kb for E1β (Figs. 5A and 5B).

The two cDNAs reported here have been identified as encoding
plastid rather than mitochondrial proteins based on their high
homology with the P. purpurea chloroplast genes, the presence
of N-terminal sequences characteristic of plastid targeting
peptides, and their relatively low homology with plant
mitochondrial E1 subunits (Grof et al., 1995; Luethy et al.,
1994; Luethy et al., 1995). Assessments of the mature
N-terminal sequences were based on homology with the mature odp
and mitochondrial E1 sequences.

The mature A. thaliana plastid E1α and E1β amino acid sequences
have the highest homology (68%) with the P. purpurea
chloroplast odpA and odpB sequences, respectively, but only 31
and 32% identity with the respective A. thaliana mitochondrial
E1 sequences (Tables 2 and 3). The homology with other
eukaryotic mitochondrial E1 sequences is lower yet.
Additionally, a monoclonal antibody prepared against
mitochondrial E1α does not recognize chloroplastic E1α (Luethy
et al., 1995), nor does the monoclonal antibody recognize the
recombinant plastid E1α on immunoblots.

Dendrogram analyses show that A. thaliana plastid E1, P.
purpurea chloroplast odp, and Synechocystis sp. (a
cyanobacterium) pdh sequences segregate as a family distinct
from mitochondrial and bacterial sequences (Figs. 6A and 6B). A
similar separation has also been shown for plastid and
mitochondrial ribosomal RNA sequences (Palmer, 1992). The A.
thaliana plastid cDNAs and P. purpurea odp genes are the only
sequences reported thus far for plastid forms of PDH.

As additional cDNAs and genes for plastid- and
mitochondrial-specific isozymes are determined, insight into
the lineage of plastid genes will be gained. Mitochondrial rRNA
genes show convincing similarity to purple-photosynthetic
bacterial rRNA sequences. In contrast, plastid rRNA has
similarity with cyanobacterial rRNA. This relationship between
plastids and cyanobacteria has also been noted for genes
encoding the transcriptional and translational apparatus
(Palmer, 1992). The new sequences reported here should
contribute to understanding whether the emergence of
mitochondria and plastids was the result of single or multiple
primary (i.e., eubacteria/eukaryotic) endosymbioses, or whether
secondary (i.e., eukaryotic/eukaryotic) endosymbioses led to
the establishment of these organelles (Palmer, 1992).

Antibodies to the E1α subunit of the plastid pyruvate
dehydrogenase complex were generated by inserting the
gel-purified BamHI to HindIII fragment of the cDNA for E1 at the
BamHI (5') to HindIII (3') cloning site of pET28a (Novagen).
The recombinant clone was expressed, and the 5' end sequenced
to ensure the correct reading frame. The recombinant protein
was expressed using the above construct in E. coli strain
BL21(DE3) (Novagen). Growth conditions were as follows: A
single colony was picked and cultured in 5 mL LB + 150
micrograms ampicillin overnight at 37°C with shaking at 200
rpm. The 5 mL culture was used to inoculate 500 mL LB + 150
micrograms ampicillin and was allowed to grow for 4 h. The
culture was then induced using 0.1 mM IPTG and allowed to shake
at 37°C for an additional 5 h. The culture was then centrifuged
in a GSA rotor at 7,000 rpm to pellet cells. Cells were lysed
in 6 M guanidinium HCl, 10 mM Tris pH 8.0 at room temperature.
Cell debris was pelleted at 12,000 rpm in an SS-34 rotor for 20
min, and the recombinant protein was purified using Ni-NTA
agarose. Rabbits were injected with 150 micrograms of
recombinant protein mixed 1:1 with complete adjuvant. A 30 day
boost was given with the same protein preparation, at the same
concentration. Ten days after the boost, the antibody titer was
determined to be 1:80,000 against pea chloroplast stromal
extract by immunoblot procedures.

It should be noted that the present invention encompasses not
only the specific DNA sequences disclosed herein and the
polypeptides encoded thereby, but also biologically functional
equivalent nucleotide and amino acid sequences. The phrase
"biologically functional equivalent nucleotide sequences"
denotes DNAs and RNAs, including chromosomal DNA, plasmid DNA,
cDNA, synthetic DNA, and mRNA nucleotide sequences, that encode
polypeptides exhibiting the same or similar enzymatic activity
as that of the enzyme polypeptides encoded by the sequences
disclosed herein when assayed by standard enzymatic methods, or
by complementation. Such biologically functional equivalent
nucleotide sequences can encode polypeptides that contain a
region or moiety exhibiting sequence similarity to the
corresponding region or moiety of the presently disclosed
polypeptides.

One can isolate polypeptides useful in the present invention
from various organisms based on homology or sequence identity.
Although particular embodiments of nucleotide sequences
encoding the polypeptides disclosed herein are shown in the
various SEQ IDs presented, it should be understood that other
biologically functional equivalent forms of such
polypeptide-encoding nucleic acids can be readily isolated
using conventional DNA-DNA or DNA-RNA hybridization techniques.
Thus, the present invention also includes nucleotide sequences
that hybridize to any of the nucleic acid SEQ IDs and their
complementary sequences presented herein, and that code on
expression for polypeptides exhibiting the same or similar
enzymatic activity as that of the presently disclosed
polypeptides. Such nucleotide sequences preferably hybridize to
the nucleic acid sequences presented herein or their
complementary sequences under moderate to high stringency (see
Sambrook et al., 1989). Exemplary conditions include initial
hybridization in 6X SSC, 5X Denhardt's solution, 100 μg/ml fish
sperm DNA, 0.1% SDS, at 55°C for sufficient time to permit
hybridization (e.g., several hours to overnight), followed by
washing two times for 15 min each in 2X SSC, 0.1% SDS, at room
temperature, and two times for 15 min each in 0.5-1X SSC, 0.1%
SDS, at 55°C, followed by autoradiography. Typically, the
nucleic acid molecule is capable of hybridizing when the
hybridization mixture is washed at least one time in 0.1X SSC
at 55°C, preferably at 60°C, and more preferably at 65°C.

The present invention also encompasses nucleotide sequences
that hybridize under salt and temperature conditions equivalent
to those described above to genomic DNA, plasmid DNA, cDNA, or
synthetic DNA molecules that encode the same amino acid
sequences as these nucleotide sequences, and genetically
degenerate forms thereof due to the degeneracy of the genetic
code, and that code on expression for a polypeptide that has
the same or similar enzymatic activity as that of the
polypeptides disclosed herein.

Biologically functional equivalent nucleotide sequences of the
present invention also include nucleotide sequences that encode
conservative amino acid changes within the amino acid sequences
of the present polypeptides, producing silent changes therein.
Such nucleotide sequences thus contain corresponding base
substitutions based upon the genetic code compared to the
nucleotide sequences encoding the present polypeptides.
Substitutes for an amino acid within the fundamental
polypeptide amino acid sequences discussed herein can be
selected from other members of the class to which the naturally
occurring amino acid belongs. Amino acids can be divided into
the following four groups: (1) acidic amino acids; (2) basic
amino acids; (3) neutral polar amino acids; and (4) neutral
non-polar amino acids. Representative amino acids within these
various groups include, but are not limited to: (1) acidic
(negatively charged) amino acids such as aspartic acid and
glutamic acid; (2) basic (positively charged) amino acids such
as arginine, histidine, and lysine; (3) neutral polar amino
acids such as glycine, serine, threonine, cysteine, cystine,
tyrosine, asparagine, and glutamine; and (4) neutral non-polar
(hydrophobic) amino acids such as alanine, leucine, isoleucine,
valine, proline, phenylalanine, tryptophan, and methionine.

Conservative amino acid changes within the present polypeptide
sequences can be made by substituting one amino acid within one
of these groups with another amino acid within the same group.
The encoding nucleotide sequences (gene, plasmid DNA, cDNA,
synthetic DNA, or mRNA) will thus have corresponding base
substitutions, permitting them to code on expression for the
biologically functional equivalent forms of the present
polypeptides.
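
By way of illustration only, the four-group classification described
above can be sketched in Python. The one-letter residue codes and the
names `AMINO_ACID_GROUPS`, `group_of`, and `is_conservative` are
assumptions introduced for this sketch and do not appear in the
disclosure; the representative lists are stated to be non-limiting,
so the sets below should be read the same way.

```python
# Illustrative only: the four groups follow the representative lists in
# the text. One-letter codes are an assumption of this sketch; cystine
# (the disulfide-linked dimer of cysteine) has no one-letter code and
# is therefore omitted.
AMINO_ACID_GROUPS = {
    "acidic": set("DE"),
    "basic": set("RHK"),
    "neutral polar": set("GSTCYNQ"),
    "neutral non-polar": set("ALIVPFWM"),
}


def group_of(residue: str) -> str:
    """Return the name of the group containing a one-letter residue."""
    for name, members in AMINO_ACID_GROUPS.items():
        if residue.upper() in members:
            return name
    raise ValueError(f"unrecognized residue: {residue!r}")


def is_conservative(original: str, substitute: str) -> bool:
    """True when both residues fall within the same group."""
    return group_of(original) == group_of(substitute)
```

Under this sketch, replacing aspartic acid with glutamic acid would be
treated as conservative, whereas replacing lysine with aspartic acid
would not.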

Useful biologically functional equivalent forms of the DNA
sequences disclosed herein include DNAs comprising nucleotide
sequences that exhibit a level of sequence identity to
corresponding regions or moieties of these DNA sequences from
40% sequence identity, or from 60% sequence identity, or from
80% sequence identity, to 100% sequence identity to the DNAs
encoding the presently disclosed polypeptides. However,
regardless of the percent sequence identity of these nucleotide
sequences, the encoded proteins would possess the same or
similar enzymatic activity as the present polypeptides. Thus,
biologically functional equivalent nucleotide sequences
encompassed by the present invention include sequences having
less than 40% sequence identity to any of the nucleic acid
sequences presented herein, so long as they encode polypeptides
having the same or similar enzymatic activity as the
polypeptides disclosed herein.

Sequence identity can be determined using the "BestFit" or
"Gap" programs of the Sequence Analysis Software Package,
Genetics Computer Group, Inc., University of Wisconsin
Biotechnology Center, Madison, WI 53711.
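
As a rough illustration of what a percent sequence identity figure
represents, the following Python sketch scores two sequences that have
already been aligned. It is not the BestFit or Gap algorithm and
performs no alignment itself; the function name, the "-" gap
character, and the choice to ignore gapped columns are assumptions made
for this sketch.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of ungapped alignment columns holding identical symbols.

    Both inputs must already be aligned (equal length). Columns in
    which either sequence has a gap are skipped, which is only one of
    several conventions in use.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared
```

A value returned by such a function could then be compared against the
40%, 60%, or 80% identity levels recited above.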

Due to the degeneracy of the genetic code, i.e., the existence
of more than one codon for most of the amino acids naturally
occurring in proteins, genetically degenerate DNA (and RNA)
sequences that contain the same essential genetic information
as the DNA sequences disclosed herein, and which encode the
same amino acid sequences as these DNA sequences, are
encompassed by the present invention. Genetically degenerate
forms of any of the other nucleic acid sequences discussed
herein are encompassed by the present invention as well.
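
The extent of this degeneracy can be illustrated with a short Python
sketch that counts how many distinct DNA sequences encode a given
peptide under the standard genetic code. The names `CODON_TABLE`,
`CODONS_PER_RESIDUE`, and `count_encodings` are introduced here for
illustration, and the sketch assumes the standard nuclear code rather
than any organelle-specific variant.

```python
from collections import Counter
from itertools import product
from math import prod

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): residue
    for codon, residue in zip(product(BASES, repeat=3), AMINO)
}

# Number of synonymous codons available for each residue ("*" = stop).
CODONS_PER_RESIDUE = Counter(CODON_TABLE.values())


def count_encodings(peptide: str) -> int:
    """Number of distinct DNA sequences that encode the peptide.

    Residues absent from the table contribute a factor of zero.
    """
    return prod(CODONS_PER_RESIDUE[residue] for residue in peptide.upper())
```

For example, the dipeptide Ala-Leu can be encoded by 4 x 6 = 24
different hexanucleotides, while Met-Trp has only one encoding.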

The alternative nucleotide sequences described above are
considered to possess a biological function substantially
equivalent to that of the polypeptide-encoding DNAs of the
present invention if they encode polypeptides having enzymatic
activity differing from that of any of the present polypeptides
by about 30% or less, preferably by about 20% or less, and more
preferably by about 10% or less when assayed in vivo by
complementation or in vitro by the standard enzymatic assays.
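
The activity criterion above can be expressed as a simple comparison.
The following Python sketch is a minimal illustration in which the
word "about" is treated as a hard cutoff, which is an assumption of
the sketch; the function names and the tier labels are likewise
hypothetical.

```python
from typing import Optional


def activity_difference_percent(reference: float, candidate: float) -> float:
    """Absolute difference in activity, as a percentage of the reference."""
    if reference <= 0:
        raise ValueError("reference activity must be positive")
    return abs(candidate - reference) / reference * 100.0


def equivalence_tier(reference: float, candidate: float) -> Optional[str]:
    """Classify a candidate against the 30%, 20%, and 10% levels.

    Returns None when the difference exceeds 30%.
    """
    difference = activity_difference_percent(reference, candidate)
    if difference <= 10:
        return "more preferred"
    if difference <= 20:
        return "preferred"
    if difference <= 30:
        return "equivalent"
    return None
```

Activities may be supplied in any consistent unit, since only the
ratio to the reference is used.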

Example 2
Cloning and Sequencing of a cDNA
Encoding the Arabidopsis thaliana
Dihydrolipoamide S-acetyltransferase (E2) Component
of the Plastid Pyruvate Dehydrogenase Complex

A search of the Arabidopsis expressed sequence tagged (EST)
database identified one Arabidopsis thaliana EST clone which
has significant homology to the (cyanobacterial) Synechocystis
sp. dihydrolipoamide acetyltransferase subunit, GenBank
accession D90915. The Arabidopsis EST clone (GenBank accession
W43179) was obtained from the Arabidopsis Biological Resource
Center (ABRC) at Ohio State University, then used to screen an
Arabidopsis λPRL2 cDNA library (ABRC) for a full length clone
as in Example 1. Two clones of approximately 1700 bp, assessed
as full length, were identified and sequenced as in Example 1.

The plastid PDC E2 clone is 1709 bp in length (SEQ ID NO: 5;
GenBank accession AF066079) with a continuous open reading
frame of 1440 bp encoding a protein of 480 amino acids (SEQ ID
NO: 6), with a deduced molecular mass of 52,400 daltons. The
mature portion of the E2 component, without the chloroplast
targeting peptide (see below), has a deduced molecular mass of
44,900 daltons. When subjected to SDS-PAGE, the full length and
the mature plastid PDC E2 proteins ran slower than a globular
protein of the same mass. These proteins appeared on SDS-PAGE
to have molecular masses of 69,000 and 62,000, respectively.
This slow migration on SDS-PAGE is consistent with the
electrophoretic behavior of mitochondrial E2 components (Guest
et al., 1985).

The mature part of the cDNA clone (coding for the catalytic
region of the protein) was expressed in E. coli using the
pET28c expression vector (Novagen, Madison, WI). The
recombinant protein (which includes a C-terminal six histidine
tag) was purified under denaturing conditions by Ni-NTA
affinity chromatography according to the manufacturer's
instructions (Qiagen Inc., Chatsworth, CA). Polyclonal
antibodies were raised to the recombinant protein in New
Zealand White rabbits. These antibodies recognize the
recombinant protein at a high dilution (1:100,000). In an
analysis of an extract of purified pea chloroplasts, these
antibodies recognized two proteins. One protein
electrophoretically migrated at an apparent mass of 62,000,
identical to the electrophoretic behavior of the mature plastid
PDC E2 component. The other protein recognized by the anti-E2
antibodies had an electrophoretic mobility corresponding to an
apparent mass of 76,000 daltons. This larger protein is likely
due to mitochondrial contamination, since its apparent mass is
equivalent to that of the mitochondrial E2 component.

The cDNAs for the Arabidopsis thaliana plastid E1α, E1β, and E2
were transcribed and translated in vitro using the TnT™
transcription/translation system (Promega, Madison, WI) with
the plasmid pZL1 (Life Technologies, Inc.) and the T7 promoter.
Presenting the product to isolated pea chloroplasts resulted in
ATP-dependent import into the plastid in a manner that protects
it from protease action. This establishes that the cDNA
sequences encode plastid targeting sequences. These targeting
sequences are assessed to be the first 68 amino acids of the
E1α subunit (Appendix B and SEQ ID NO: 2), the first 73 amino
acids of the E1β subunit (Appendix D and SEQ ID NO: 4), and the
first 54 amino acids of the E2 component (SEQ ID NO: 6).

Example 3
Cloning and Sequencing of cDNA
Encoding the Arabidopsis thaliana E1α Subunit
of the Branched-Chain Oxoacid Dehydrogenase Complex

Selection of an A. thaliana expressed sequence tagged (EST)
cDNA clone (Newman et al., 1994) was accomplished by searching
the Arabidopsis EST database using the BLASTP program of the
National Center for Biotechnology Information. One EST cDNA
clone (GenBank accession N96041) was found to have significant
homology to the tomato, human, and bovine BCOADC E1α subunits,
making it a candidate for the A. thaliana E1α. This cDNA clone
was obtained from the Arabidopsis Biological Resource Center at
the Ohio State University. The clone was sequenced completely
on both strands by subcloning restriction enzyme fragments of
the clone and using two specific oligonucleotide primers
designed from previously sequenced stretches. Sequencing was
conducted by the DNA core facility at the University of
Missouri, Columbia, MO on an ABI 377 instrument. The BCOADC E1α
cDNA clone is 1587 bp, with a 3' untranslated region of 165 bp
(Appendix E and SEQ ID NO: 11). The open reading frame encodes
a protein of 472 amino acids (Appendix F and SEQ ID NO: 12)
with a deduced molecular mass of 53,363 daltons. We have not
identified an initiating methionine/start codon, but alignment
with the tomato, bovine, human and mouse sequences shows the
clone is considerably longer than the mature coding region of
these proteins.

The deduced amino acid sequence of the clone has significant
homology to BCOADC E1α sequences in the database: 56.8%
identity with the tomato, 42% with the human, 40.7% with the
bovine, and 41.6% with the mouse E1α amino acid sequences.
Though an initiating methionine was not identified, the
N-terminus has properties similar to a mitochondrial targeting
peptide. The PSORT program (prediction of protein intracellular
localization sites) suggests the mitochondrial matrix as the
most probable destination of the A. thaliana E1α protein.
However, the amino acid sequence also contains an SKL motif
close to the C-terminus which is indicative of peroxisomal
localization, and this is the second most probable localization
site determined by the PSORT program.

Ser 366 of the A. thaliana amino acid sequence is at a position
which is conserved in all the above sequences. This site is a
designated phosphorylation site for the mouse and bovine
sequences. However, the second conserved Ser phosphorylation
site in the animal sequences is replaced by a Pro in the tomato
sequence and an Ala in the A. thaliana sequence (Appendix F and
SEQ ID NO: 12).

Example 4
Cloning and Sequencing of cDNA
Encoding the Arabidopsis thaliana E1β Subunit
of the Branched-Chain Oxoacid Dehydrogenase Complex

Selection of Arabidopsis thaliana expressed sequence tagged
(EST) clones (Newman et al., 1994) was accomplished by
searching the Arabidopsis EST database using the BLASTP program
of the National Center for Biotechnology Information. Two EST
clones were found to have significant homology to the human and
bovine branched-chain oxoacid dehydrogenase (BCOADC) E1β
subunit. These two clones (GenBank accessions T04217 and
H37020) were identified as potentially encoding the Arabidopsis
thaliana BCOADC E1β subunits. We obtained these partial EST
clones from the Arabidopsis Biological Resource Center (ABRC)
at Ohio State University. One of these clones, GenBank
accession T04217, was used to screen an Arabidopsis cDNA
library for full length clones. The EST cDNAs were gel purified
from low-melting agarose and probes prepared by labeling with
[α-32P]dATP using a random prime oligonucleotide labeling kit
(Pharmacia, Piscataway, NJ). Probes were desalted using
Sephadex G-50 chromatography to remove unincorporated
nucleotides. An Arabidopsis cDNA library (λPRL2, obtained from
the ABRC) was plated at a density of 2.9 x 10^4 plaques per
plate for a total of 2.03 x 10^5 plaques. Biotrace NT nylon
filters (Gelman, Ann Arbor, MI) were used for plaque-lifts and
were processed according to the manufacturer's specifications.
Prehybridization and hybridizations were performed according to
Current Protocols in Molecular Biology (Ausubel et al., 1994).
After three successive rounds of screening, five independent
potential E1β cDNA clones were isolated, ranging in size from
500 to 1400 bp. Two of the five cDNA clones were selected for
sequencing. Plaque-purified λ phage were treated according to
the manufacturer's instructions (GibcoBRL, Gaithersburg, MD) in
order to excise the pZL-1 recombinant clones. The cDNA
sequences were obtained by sequencing both strands of the cDNA
clone (and deletion fragments derived therefrom) using
dye-deoxy terminator cycle sequencing reactions and an ABI
Prism Model 377 sequencer, according to the manufacturer's
instructions. Results from sequencing reactions were analyzed
using IntelliGenetics GeneWorks DNA analysis program version
2.5 for Macintosh computers. Both cDNAs were identical. The
BCOADC E1β cDNA is 1319 bp (Appendix G and SEQ ID NO: 13) and
contains a 133 bp 5' untranslated region, an open reading frame
of 1056 bp, and a 130 bp 3' untranslated region. The open
reading frame encodes a protein with 352 deduced amino acids
(Appendix H and SEQ ID NO: 14) with a calculated mass of 37,810
daltons.
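
The reported dimensions of the BCOADC E1β cDNA are internally
consistent, as the following Python sketch illustrates using only
figures stated in the text. The function names are introduced for
illustration; the sketch simply divides the open reading frame length
by three and makes no assumption about whether a stop codon is
counted within the stated 1056 bp.

```python
def cdna_length(utr5_bp: int, orf_bp: int, utr3_bp: int) -> int:
    """Total cDNA length from its three consecutive segments."""
    return utr5_bp + orf_bp + utr3_bp


def codons_in_orf(orf_bp: int) -> int:
    """Number of whole codons in an open reading frame."""
    if orf_bp % 3 != 0:
        raise ValueError("open reading frame is not a whole number of codons")
    return orf_bp // 3


# Figures stated for the BCOADC E1-beta cDNA (SEQ ID NO: 13).
E1B_TOTAL_BP = cdna_length(utr5_bp=133, orf_bp=1056, utr3_bp=130)
E1B_CODONS = codons_in_orf(1056)
```

The same division applies to the other clones described herein, for
example the 1440 bp open reading frame of the plastid PDC E2 clone.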
Table 4 shows the alignment of the deduced amino acid sequences
of various BCOADC E1β subunits. "." indicates conserved amino
acids; "-" indicates a gap inserted to maximize homology. The
deduced amino acid sequence is 59% identical to the mammalian
BCOADC E1β subunit (Table 4). The primary sequence contains no
obvious organellar targeting information.

The cDNA was expressed in E. coli after insertion into the
plasmid vector pMal (New England Biolabs). The purified protein
was used to prepare polyclonal antibodies which recognize the
recombinant protein.



Example 5
Cloning and Sequencing of cDNA
Encoding the Arabidopsis thaliana
Dihydrolipoamide S-acyltransferase (E2) Component
of the Branched-Chain Oxoacid Dehydrogenase Complex

A search of the Arabidopsis expressed sequence tagged (EST)
database identified two Arabidopsis thaliana EST clones which
have significant homology to the bovine and human
branched-chain dihydrolipoamide acyltransferase subunit. These
clones (GenBank accessions T42996 and N37840) were obtained
from the Arabidopsis Biological Resource Center (ABRC) at Ohio
State University. Sequencing of the 5' ends of the two clones
showed only one to be a branched-chain E2 sequence (the other
contained vector sequence only). The branched-chain EST clone
(GenBank accession T42996) was sequenced completely on both
strands by subcloning of restriction enzyme derived fragments
and by primer walking. Sequencing reactions and analysis were
performed as in Example 1.

The clone (SEQ ID NO: 15) is 1618 bp in length and contains an
open reading frame of 1449 bp encoding a protein of 483 amino
acids (SEQ ID NO: 16) with a predicted molecular mass of 52,729
daltons. Part of the cDNA clone (coding for the lipoyl and
subunit-binding domains, and part of the catalytic domain) was
expressed in E. coli using the pET28a expression vector
(Novagen, Madison, WI). The recombinant protein (which includes
a C-terminal six histidine tag) was purified under denaturing
conditions by Ni-NTA affinity chromatography according to the
manufacturer's instructions (Qiagen Inc., Chatsworth, CA).
Polyclonal antibodies were raised to the recombinant protein in
New Zealand White rabbits. These antibodies recognize the
recombinant protein at a high dilution (>1:100,000).

Example 6
Engineering Chimeric Branched Chain
Oxoacid Dehydrogenase Complex E1α and E1β Subunits
to Utilize the Plastid
Pyruvate Dehydrogenase Complex E2 and E3 Components
to Form a Hybrid Complex

The cDNA (or other encoding DNA) of the BCOADC E1β subunit can
be used to form a chimeric protein targeted to the plastid to
utilize the plastid pyruvate dehydrogenase complex (PDC) E2
component to produce propionyl-CoA. The chimeric BCOADC E1β
subunit can be modified to comprise the E2 binding region of
the plastid PDC E1β subunit and a plastid targeting sequence.
The thus modified BCOADC E1β subunit can then be imported into
the chloroplast, where it binds to the plastid PDC E2 component
and, in conjunction with the plastid PDC E3 component,
catalyzes the production of propionyl-CoA from 2-oxobutyrate.
This leads to the production of the PHA precursor
3-hydroxyvaleryl-CoA, and consequently to biosynthesis of the
PHA co-polymer poly(3HB-co-3HV) in plants that have been
engineered to contain other enzymes necessary for biosynthesis
of this copolymer, as discussed above.

The nucleotide sequence that encodes the BCOADC E1β region 1
(the region or domain of the E1β protein that binds the BCOADC
E1β component to the E2 core of the BCOADC complex [Wexler et
al., 1991]) can be excised and replaced with the nucleotide
sequence corresponding to the PDC E2 binding region from the
plastid PDC E1β subunit (Johnston et al., 1997; Luethy et al.,
1994). The construct can be further engineered to comprise a
plastid targeting sequence of another plastid protein such as
the Rubisco small subunit (Table 1) (von Heijne et al., 1991),
or to comprise the plastid targeting sequence of the plastid
PDC E1β subunit described by Johnston et al. (1997). See Figure
7B.

Chimeric fusions of plastid targeting sequences and the BCOADC
E1α and E1β subunits can be generated by amplifying fragments
of DNA coding for the regions involved. Chloroplast targeting
peptides from each of the plastid PDC E1 subunits (PDC E1α and
E1β) (Johnston et al., 1997) can be amplified from the original
cDNAs (SEQ ID NOs 1 and 3). Similarly, the mature portions of
the BCOADC E1α and E1β subunits can be amplified from their
cDNAs (SEQ ID NOs 11 and 13). A unique restriction site can be
included in the primer design to permit ligation of the
chloroplast targeting peptides in-frame with the mature
portions of the BCOADC E1α and E1β subunits.

To produce a BCOADC E1β chimera that can associate with the PDC
E2 subunit, one can modify the BCOADC E1β subunit to include
the plastid PDC E1β targeting peptide along with the plastid
PDC E1β E2 binding region. In the final construct, the sequence
for the E2 binding region follows (i.e., is 3' to) the sequence
for the targeting peptide, so that the chimeric BCOADC E1β
protein contains approximately one-third plastid PDC E1β
presequence (for example, amino acid residues 1 through 146 of
SEQ ID NO: 4) and the remainder consists of the BCOADC E1β
subunit (for example, amino acid residues 94 through 352 of SEQ
ID NO: 14). The PDC E1β chloroplast targeting peptide and
plastid PDC E2 binding region of the PDC E1β subunit can be
amplified from the plastid PDC E1β cDNA (SEQ ID NO: 4) using
the following gene specific primer (SEQ ID NO: 28) and a
commercially available primer (e.g. M13/pUC forward primer,
available from e.g. Stratagene, La Jolla, CA).

Forward oligonucleotide: 5' GGGCCC CATATG TCTTCGATAATC 3'
(SEQ ID NO: 28). Nucleotides 7 through 21 are preceded by an
NdeI enzyme site.
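
The "approximately one-third" proportion stated for the chimeric
BCOADC E1β protein can be checked from the exemplary residue ranges
given above. The following Python sketch is illustrative only; the
function and variable names are introduced here, and the ranges are
taken as inclusive, which is an assumption of the sketch.

```python
def span_length(first: int, last: int) -> int:
    """Number of residues in an inclusive range first..last."""
    if last < first:
        raise ValueError("last residue precedes first residue")
    return last - first + 1


# Exemplary ranges from the text.
PDC_PART = span_length(1, 146)      # plastid PDC E1-beta, SEQ ID NO: 4
BCOADC_PART = span_length(94, 352)  # BCOADC E1-beta, SEQ ID NO: 14

CHIMERA_LENGTH = PDC_PART + BCOADC_PART
PDC_FRACTION = PDC_PART / CHIMERA_LENGTH
```

With these ranges the plastid PDC portion contributes 146 of 405
residues, or about 36% of the chimera.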

The mature part of the BCOADC E1β sequence (excluding the
native BCOADC E2 binding site) can be amplified from the cDNA
of SEQ ID NO: 13 using the following gene specific primers:

Forward oligonucleotide: 5' GGGCCC ACCGGT TTTGGCATTGGTCTA 3'
(SEQ ID NO: 24). Nucleotides 406 through 423 are preceded by an
AgeI enzyme site.

Reverse oligonucleotide: 5' GGGCCC GAATTC TCATTACTAGTAATTCAC
AGT 3' (SEQ ID NO: 25). Nucleotides 1177 through 1191 are
preceded by an EcoRI enzyme site.
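
The primers above share a common layout: a 5' GGGCCC extension, a
restriction enzyme recognition site, and a gene specific region. The
following Python sketch locates the recognition site in a primer and
splits the primer into those three parts. The recognition sequences
are the standard ones for the named enzymes; the names
`RECOGNITION_SITES` and `split_primer` are introduced for
illustration, and spaces in the primers as printed are simply removed.

```python
# Standard recognition sequences (5' to 3') for the enzymes named in
# this Example.
RECOGNITION_SITES = {
    "NdeI": "CATATG",
    "AgeI": "ACCGGT",
    "EcoRI": "GAATTC",
    "NcoI": "CCATGG",
    "BclI": "TGATCA",
    "NotI": "GCGGCCGC",
    "SalI": "GTCGAC",
}


def split_primer(primer: str, enzyme: str) -> tuple[str, str, str]:
    """Split a primer into (5' extension, site, gene specific region).

    The first occurrence of the enzyme's recognition sequence is used.
    """
    seq = primer.replace(" ", "").upper()
    site = RECOGNITION_SITES[enzyme]
    start = seq.find(site)
    if start < 0:
        raise ValueError(f"{enzyme} site not found in primer")
    return seq[:start], site, seq[start + len(site):]
```

Applied to the forward oligonucleotide of SEQ ID NO: 24, for example,
such a function would separate the GGGCCC extension and the AgeI site
from the 15-nucleotide gene specific region.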

The resulting truncated BCOADC E1β sequence can be ligated to
the plastid PDC E1β sequence using the AgeI enzyme site already
present in the plastid PDC sequence at a convenient position
(amino acid residue 146). The above primers can be utilized to
produce DNA fragments useful in joining the noted regions of
the plastid PDC and BCOADC E1β sequences without any introduced
or substituted amino acids (Figure 7B).

To produce a BCOADC E1α chimera that can be targeted to a
plastid, a chloroplast targeting peptide, for example the
chloroplast targeting peptide from the plastid PDC E1α subunit
(Johnston et al., 1997) (corresponding to amino acid residues 1
through 68), can be attached 5' to the mature portion of the
BCOADC E1α subunit. A DNA fragment corresponding to the plastid
targeting peptide can be amplified from the original PDC E1α
cDNA (SEQ ID NO: 1) using the following gene specific primers
(SEQ ID NO: 29 and SEQ ID NO: 30):

Forward primer: 5' GGGCCC CCATGG CGACGGCTTTCGCT 3' (SEQ ID NO:
29). Nucleotides 107 to 124 are preceded by an NcoI enzyme
site.

Reverse primer: 5' GGGCCC TGATCA TATTATTGGTGGATTGCTT 3' (SEQ ID
NO: 30). Nucleotides 311 to 328 are preceded by a BclI enzyme
site.

The entire mature coding region of the BCOADC E1α subunit can
then be excised from the cDNA (SEQ ID NO: 11) using convenient
restriction enzyme sites, BclI at nucleotides 195 through 200,
and XbaI at nucleotides 1424 through 1429. This includes the 3'
stop codon.

The restriction enzyme fragments generated from both the
plastid PDC and BCOADC E1α sequences can then be ligated
together and subcloned into an appropriate vector (e.g. pZL1,
Life Technologies Inc., Gaithersburg, MD). The BclI site used
to ligate the two sequences introduces a single His residue
between the plastid PDC E1α targeting peptide and the BCOADC
E1α mature region. The consequence of this addition can be
determined experimentally to assess its impact, if any, on
import and processing of the BCOADC E1α subunit, and on
assembly of the hybrid BCOADC E1 complex.

An alternative approach to ligating the plastid PDC and BCOADC
E1α sequences using the BclI site is to use a NotI site in its
place in the design of the reverse oligonucleotide for the
plastid targeting peptide, as follows (SEQ ID NO: 19):

Plastid PDC E1α reverse primer: 5' GGGCCC GCGGCCGC
ATTATTGGTGGATTGCTT 3' (SEQ ID NO: 19). Nucleotides 311 through
328 are preceded by a NotI enzyme site.

The coding region for the mature BCOADC E1α protein (Appendix F
and SEQ ID NO: 12) can then be amplified from the cDNA (SEQ ID
NO: 11) using the following gene-specific primers:

Forward primer: 5' GGGCCC GCGGCCGC TGATCATTTGGTTCAGCAG 3' (SEQ
ID NO: 20). Nucleotides 195 through 213 are preceded by a NotI
enzyme site.

Reverse primer: 5' GGGCCC GTCGAC TCAAACATGAAAGCCAGG 3' (SEQ ID
NO: 21). Nucleotides 1405 through 1422 are preceded by a SalI
enzyme site and include the stop codon.

Ligation of the two resulting sequences using the NotI enzyme
site will introduce three Ala residues between them, which
avoids the introduction of a charged residue (His) that results
from using the BclI site described above.
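
The three Ala residues follow from the NotI recognition sequence
itself, as the following Python sketch illustrates. The sketch assumes
that the reading frame begins at the first base of the NotI site,
which is an assumption made here for illustration, and it uses the
standard genetic code; `CODON_TABLE` and `translate` are names
introduced for the sketch.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): residue
    for codon, residue in zip(product(BASES, repeat=3), AMINO)
}

NOTI_SITE = "GCGGCCGC"


def translate(dna: str) -> str:
    """Translate DNA in frame 1, ignoring spaces and trailing bases."""
    seq = dna.replace(" ", "").upper()
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# The 8 bp site plus any following base reads GCG GCC GCN, and every
# GCN codon encodes alanine.
LINKER_RESIDUES = {translate(NOTI_SITE + base) for base in "ACGT"}
```

In this frame the NotI site followed by the gene specific region of
SEQ ID NO: 20 would read Ala-Ala-Ala-Asp-His-Leu-Val-Gln-Gln.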

To confirm the ability of the chimeric BCOADC E1α and E1β
proteins to be imported into chloroplasts, the DNA encoding
these chimeric proteins can be subcloned into a transcription
vector such as pZL1 (Life Technologies Inc., Gaithersburg, MD)
with the T7 promoter. The chimeric proteins are then
transcribed/translated in vitro, for example using the TnT™
transcription/translation system (Life Technologies Inc.), and
import assays with isolated chloroplasts can be performed. This
is a reliable assay to test the import and assembly of the
chimeric proteins.

Experimental results have established that in vitro imported
plastid PDC E1α and E1β subunit proteins associate to form the
plastid pyruvate dehydrogenase heterotetramer within the
chloroplast matrix, and that this heterotetramer associates
with imported PDC E2 subunits (Randall et al., unpublished).

To obtain constitutive expression of the chimeric
proteins in plants, their coding regions are preferably
fused to the CaMV 35S promoter sequence. For
dicotyledonous plants, the use of the pZP200 binary
vector, for Agrobacterium transformation, is preferred.

The chimeric nucleic acids disclosed above are used
to transform Arabidopsis thaliana or other plants by
various methods well known in the art. As one
alternative, the BCOADC E1α-chimeric construct comprising
the plastid PDC E1α targeting sequence is used to produce
transformed plants that are then crossed with plants that
have been transformed with the BCOADC E1β-chimeric
construct containing the plastid PDC E1β subunit
targeting sequence and E2 component binding region.

As another alternative, a compound construct
containing both the plastid-targeted BCOADC E1α-chimera
and the plastid-targeted BCOADC E1β-chimera containing
the PDC E1β E2 binding region is constructed in the form
of a mega plasmid and used to transform plants by
standard protocols for expression of both subunit
chimeras simultaneously (Figure 7D). This can be
achieved by including a stop signal at the 3' end of the
BCOADC E1α chimeric sequence and a NOS transcription
termination sequence. In order to obtain co-expression
of the two chimeric sequences, a second CaMV 35S promoter
sequence can be placed 3' to the transcription
termination sequence of the plastid-targeted BCOADC E1α
chimeric coding sequence. This second promoter sequence
can in turn be followed by the sequence coding for the
BCOADC E1β chimera. This creates a mega plasmid or
compound construct coding for both the BCOADC E1α and β
subunit chimeras (Figure 7D).
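The element order of the compound construct can be written down as an ordered list and grouped into expression units. This is only a sketch of the order described for Figure 7D: the element labels and the helper function are hypothetical, and no terminator is listed after the second coding sequence because the text does not specify one.

```python
# Hypothetical element labels; the order follows the description above.
COMPOUND_CONSTRUCT = [
    ("promoter", "CaMV 35S"),
    ("cds", "plastid-targeted BCOADC E1alpha chimera (with stop signal)"),
    ("terminator", "NOS"),
    ("promoter", "CaMV 35S (second copy)"),
    ("cds", "plastid-targeted BCOADC E1beta chimera"),
]


def expression_units(elements):
    """Group elements into units, starting a new unit at each promoter."""
    units = []
    for kind, name in elements:
        if kind == "promoter":
            units.append([])
        if not units:
            raise ValueError("element found before the first promoter")
        units[-1].append((kind, name))
    return units


units = expression_units(COMPOUND_CONSTRUCT)
cds_per_unit = [sum(1 for kind, _ in unit if kind == "cds") for unit in units]

for unit in units:
    print(" -> ".join(name for _, name in unit))
```

Each unit carries its own promoter and exactly one chimeric coding sequence, which is the arrangement that allows both subunit chimeras to be expressed from a single plasmid.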

The BCOADC E1α and β subunit chimeras thus targeted
to the plastid bind to the plastid PDC E2 component (E2
components form the core of the complexes to which the E1
and E3 components bind). Since the chimeric BCOADC E1β
subunit comprises the plastid PDC E1β E2 binding domain,
a hybrid complex is formed. This hybrid complex is
designed to have an enhanced ability to utilize
2-oxobutyrate as substrate in order to produce
propionyl-CoA for 3-HV biosynthesis. Transgenic plants containing
this hybrid complex can then be crossed by standard
protocols with plants having enhanced ability to generate
2-oxobutyrate in the plastid compartment produced as
described, for example, in Gruys et al. (1998).

Example 7 

Targeting the BCOADC E1α, E1β, and E2 components
to the Plastid to Form a Hybrid Complex
with the Plastid PDC E3 Component

DNAs encoding the BCOADC E1α and β subunits and E2
component can be fused with plastid targeting sequences
to direct importation of these proteins into the plastid
to enhance propionyl-CoA production from 2-oxobutyrate.
In this method, constructs of the BCOADC E1α and β
subunits, the BCOADC E2 component, and, if desired, the
BCOADC E3 subunit, can be made with plastid targeting
sequences, for example with plastid targeting sequences
of the plastid pyruvate dehydrogenase complex (PDC) E1α
and β subunits (Johnston et al., 1997) or the plastid PDC
E2 component. See Figures 7A, 7C, and 7E. These
constructs can be used to transform plants individually
(followed by genetic crossing to combine the necessary
components from each plant) or together to direct the
desired BCOADC components to the plastid. The BCOADC
E1α-chimera is as described above in Example 6. The
BCOADC E1β-chimera containing the PDC E1β E2 binding
region is also described in Example 6. When the
plastid-targeted BCOADC E2 chimera is also employed (see below),
the E2 binding region of the BCOADC E1β subunit need not
be replaced with the plastid PDC E1β subunit E2 binding
region. Instead, only the plastid PDC E1β targeting
peptide is attached to the mature portion of the BCOADC
E1β subunit (still retaining the native binding site for
the BCOADC E2 component) (Figure 7E). This can be
achieved by amplifying the appropriate regions of the PDC
and BCOADC E1β cDNA sequences or other functionally
equivalent DNA sequences. That portion of the cDNA
coding for the plastid targeting peptide of the PDC E1β
(amino acids 1 through 97) can be amplified from the cDNA
(SEQ ID NO: 3) using the following gene specific primers.
This amplified fragment includes a portion of the linker
region between the targeting peptide and the E2-binding
region.

Forward oligonucleotide: 5' GGGCCC CATATG TCTTCGATAATC 3'
(SEQ ID NO: 22). Nucleotides 7 through 21 are preceded by
an NdeI enzyme site.

Reverse oligonucleotide: 5' GGGCCC CTCGAG ACCTTCCTGAAGAGC
3' (SEQ ID NO: 23). Nucleotides 277 through 297 are
preceded by an XhoI enzyme site.
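The NdeI site in the forward oligonucleotide (SEQ ID NO: 22) contains an ATG. A short translation shows what the primer encodes if reading starts there. This is a hedged sketch: it assumes the ATG inside the NdeI site serves as the start codon (not stated explicitly in the text), uses the standard genetic code, and the helper names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna):
    """Translate complete codons of an uppercase DNA string (frame 0)."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


NDEI = "CATATG"  # commonly cited NdeI recognition sequence
forward = "GGGCCC CATATG TCTTCGATAATC"  # SEQ ID NO: 22

seq = forward.replace(" ", "")
site_start = seq.find(NDEI)
# Assumption: translation begins at the ATG inside the NdeI site.
orf = seq[site_start + 3:]
peptide = translate(orf)
print(orf, peptide)
```

Under that assumption the primer encodes Met-Ser-Ser-Ile-Ile over 15 nucleotides, which matches both the 15-nucleotide range given for the primer (7 through 21) and the first residues of the plastid sequence in the Figure 9A alignment.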

The mature portion of the BCOADC E1β sequence
(including the native BCOADC E2 binding region), i.e.,
amino acid residues 45 through 349, can be amplified from
the cDNA of SEQ ID NO: 13 using the following gene
specific primers:

Forward oligonucleotide: 5' GGGCCC CTCGAG ATCGCTTTGGACACC
3' (SEQ ID NO: 31). Nucleotides 262 through 277 are
preceded by an XhoI enzyme site.

Reverse oligonucleotide: 5' GGGCCC GAATTC
TCATTACTAGTAATTCAC AGT 3' (SEQ ID NO: 25). Nucleotides
1177 through 1191 are preceded by an EcoRI enzyme site.

Use of the foregoing oligonucleotide primers allows
the joining of the appropriate plastid PDC and BCOADC E1β
sequences without any introduced or substituted amino
acids (Figure 7E). As disclosed in Example 6, the
resulting DNA can be subcloned into a transcription
vector to test import and assembly prior to
transformation of Arabidopsis or other plants (or prior
to the construction of a mega plasmid for co-expression,
cf. Figure 7D).
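The junction formed at the shared XhoI site can be translated to see which residues it encodes. The sketch below is illustrative only: it assumes the reading frame starts at the first base of the sense-strand view of the PDC reverse oligonucleotide, uses the standard genetic code, and the helper names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}
XHOI = "CTCGAG"  # commonly cited XhoI recognition sequence


def translate(dna):
    """Translate complete codons of an uppercase DNA string (frame 0)."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq))


def gene_part(primer):
    """Return the bases that follow the XhoI site in a primer."""
    seq = primer.replace(" ", "")
    return seq[seq.find(XHOI) + len(XHOI):]


pdc_reverse = "GGGCCC CTCGAG ACCTTCCTGAAGAGC"     # SEQ ID NO: 23
bcoadc_forward = "GGGCCC CTCGAG ATCGCTTTGGACACC"  # SEQ ID NO: 31 as printed

upstream = reverse_complement(gene_part(pdc_reverse))
junction = upstream + XHOI + gene_part(bcoadc_forward)
# Assumption: frame 0 begins at the first base of `upstream`.
peptide = translate(junction)
print(junction, peptide)
```

Under this frame assumption the junction reads Ala-Leu-Gln-Glu-Gly, then Leu-Glu from the XhoI site, then Ile-Ala-Leu-Asp-Thr. The Leu-Glu pair appears to coincide with residues already present at that position in the plastid sequence of the Figure 9A alignment, which is consistent with the statement that no amino acids are introduced or substituted.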

Further to the above, a chimera comprising the
plastid targeting sequence (nucleotides 59-232) of the
plastid PDC E2 (dihydrolipoamide acetyltransferase)
component and the sequence for the mature BCOADC
dihydrolipoamide acyltransferase (E2) subunit can be
constructed. The N-terminus of the BCOADC E2 subunit
can be replaced with
the chloroplast targeting peptide from the plastid PDC E2
subunit. In this case, the native E2 binding domain of
the BCOADC E1β subunit need not be replaced with the E2
binding domain of the plastid PDC E1β subunit as
described in Example 6. Only the plastid PDC E2
targeting peptide is needed because the BCOADC E2
component which is imported into the plastid will
naturally associate with the BCOADC E1β subunit.

The plastid targeting sequence can be amplified
from the plastid PDC E2 cDNA of SEQ ID NO: 5 using the
following gene-specific primers:

Forward primer: 5' GGGCCC CATATG GCGGTTTCTTCT 3' (SEQ
ID NO: 26). Nucleotides 59 through 73 are preceded by
an NdeI enzyme site.

Reverse primer: 5' GGGCCC CCATGGC AATTTCAGGATTCTT 3'
(SEQ ID NO: 27). Nucleotides 218 through 232 are
preceded by an NcoI enzyme site.

The region coding for the mature portion of the
BCOADC E2 protein can be excised from the cDNA (SEQ
ID NO: 15) using convenient restriction enzymes (NcoI
and NotI). This DNA fragment is then ligated
in-frame with the PDC E2 plastid targeting peptide using
the common NcoI enzyme site (Figure 7C). As
described in Example 6, the import and assembly of
this chimeric E2 subunit can be examined by in vitro
import assays. Efficient import of the BCOADC E2
protein into isolated pea chloroplasts and formation
of a complex with both the endogenous PDC
heterotetramer and imported BCOADC E1α-E1β
heterotetramer can be determined.

The plastid-targeted branched-chain oxoacid
dehydrogenase complex components utilize any
2-oxobutyrate (α-ketobutyrate) produced in the
plastid to make propionyl-CoA, which in turn is a
substrate for the enzymes producing
polyhydroxyalkanoic acids (PHAs).
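For reference, the overall reaction catalyzed by a 2-oxoacid dehydrogenase complex acting on 2-oxobutyrate can be written as follows. This is the general textbook stoichiometry for this class of complexes (E1, E2 and E3 acting in sequence), given here as an illustration and not taken from the specification:

```latex
\[
\text{2-oxobutyrate} + \text{CoA-SH} + \text{NAD}^{+}
\;\longrightarrow\;
\text{propionyl-CoA} + \text{CO}_{2} + \text{NADH} + \text{H}^{+}
\]
```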

As previously indicated, it appears to be
unnecessary to prepare a plastid-targeted construct
for the BCOADC E3 component since the E3 components
of all of the mitochondrial α-ketoacid dehydrogenase
complexes appear to be interchangeable. The PDC E3
component already present in the plastid should
function with the plastid-targeted BCOADC E1α, E1β,
and E2 subunits. If desired, one can, for example,
place a plastid targeting sequence on the
mitochondrial E3 component in place of the first 31
amino acids of the mitochondrial PDC E3 reported by
Turner et al. (1992) (GenBank accession number
X2995), corresponding to the first 72 nucleotides of
that particular cDNA. This is done by standard
protocols well known to those skilled in the art.

As discussed above, the plastid is capable of
PHA biosynthesis when the appropriate enzymes are
present in the plant (Poirier et al., 1992; Nawrath
et al., 1994). Targeting BCOADC subunits and
components to this organelle as described in Examples
6 and 7 herein further enhances the ability of plants to
biosynthesize the 3HB-co-3HV copolymer.

The invention being thus described, it will be
obvious that the same can be varied in many ways.
Such variations are not to be regarded as a departure
from the spirit and scope of the present invention,
and all such modifications and equivalents as would
be obvious to one skilled in the art are intended to
be included within the scope of the following claims.
What Is Claimed Is:

1. An isolated DNA molecule, comprising a
nucleotide sequence selected from the group
consisting of:

(a) the nucleotide sequence shown in SEQ ID
NO: 1, or the complement thereof;

(b) a nucleotide sequence that hybridizes to
said nucleotide sequence of (a) under a wash
stringency equivalent to 0.5X SSC to 2X SSC, 0.1%
SDS, at 55-65°C, and which encodes a polypeptide
having enzymatic activity similar to that of
Arabidopsis thaliana plastid pyruvate dehydrogenase
complex E1α subunit;

(c) a nucleotide sequence encoding the same
genetic information as said nucleotide sequence of
(a), but which is degenerate in accordance with the
degeneracy of the genetic code; and

(d) a nucleotide sequence encoding the same
genetic information as said nucleotide sequence of
(b), but which is degenerate in accordance with the
degeneracy of the genetic code.

2. A recombinant vector, comprising said 
isolated DNA molecule of claim 1. 

3. A host cell transformed with said
recombinant vector of claim 2.

4. An isolated polypeptide having the amino
acid sequence of SEQ ID NO.: 2.

5. An isolated DNA molecule, comprising a
nucleotide sequence selected from the group
consisting of:

(a) the nucleotide sequence shown in SEQ ID
NO: 3, or the complement thereof;

(b) a nucleotide sequence that hybridizes to
said nucleotide sequence of (a) under a wash
stringency equivalent to 0.5X SSC to 2X SSC, 0.1%
SDS, at 55-65°C, and which encodes a polypeptide
having enzymatic activity similar to that of
Arabidopsis thaliana plastid pyruvate dehydrogenase
complex E1β subunit;

(c) a nucleotide sequence encoding the same
genetic information as said nucleotide sequence of
(a), but which is degenerate in accordance with the
degeneracy of the genetic code; and

(d) a nucleotide sequence encoding the same
genetic information as said nucleotide sequence of
(b), but which is degenerate in accordance with the
degeneracy of the genetic code.

6. A recombinant vector, comprising said 
isolated DNA molecule of claim 5. 

7. A host cell transformed with said
recombinant vector of claim 6.

8. An isolated polypeptide having the amino
acid sequence of SEQ ID NO.: 4.

9. An isolated DNA molecule, comprising a
nucleotide sequence selected from the group
consisting of:

(a) the nucleotide sequence shown in SEQ ID
NO: 5, or the complement thereof;

(b) a nucleotide sequence that hybridizes to
said nucleotide sequence of (a) under a wash
stringency equivalent to 0.5X SSC to 2X SSC, 0.1%
SDS, at 55-65°C, and which encodes a polypeptide
having enzymatic activity similar to that of
Arabidopsis thaliana plastid pyruvate dehydrogenase
complex E2 component;

(c) a nucleotide sequence encoding the same
genetic information as said nucleotide sequence of
(a), but which is degenerate in accordance with the
degeneracy of the genetic code; and

(d) a nucleotide sequence encoding the same
genetic information as said nucleotide sequence of
(b), but which is degenerate in accordance with the
degeneracy of the genetic code.

10. A recombinant vector, comprising said 
isolated DNA molecule of claim 9. 

11. A host cell transformed with said 
recombinant vector of claim 10. 

12. An isolated polypeptide having the amino
acid sequence of SEQ ID NO.: 6.

13. An isolated DNA molecule, comprising a
nucleotide sequence selected from the group
consisting of:

(a) the nucleotide sequence shown in SEQ ID
NO: 11, or the complement thereof;

(b) a nucleotide sequence that hybridizes to
said nucleotide sequence of (a) under a wash
stringency equivalent to 0.5X SSC to 2X SSC, 0.1%
SDS, at 55-65°C, and which encodes a polypeptide
having enzymatic activity similar to that of
Arabidopsis thaliana branched chain 2-oxoacid
dehydrogenase complex E1α subunit;

(c) a nucleotide sequence encoding the same
genetic information as said nucleotide sequence of
(a), but which is degenerate in accordance with the
degeneracy of the genetic code; and

(d) a nucleotide sequence encoding the same
genetic information as said nucleotide sequence of
(b), but which is degenerate in accordance with the
degeneracy of the genetic code.

14. A recombinant vector, comprising said 
isolated DNA molecule of claim 13. 

15. A host cell transformed with said 
recombinant vector of claim 14. 

16. An isolated polypeptide having the amino 
acid sequence of SEQ ID NO.: 12. 

17. An isolated DNA molecule, comprising a
nucleotide sequence selected from the group
consisting of:

(a) the nucleotide sequence shown in SEQ ID
NO: 13, or the complement thereof;

(b) a nucleotide sequence that hybridizes to
said nucleotide sequence of (a) under a wash
stringency equivalent to 0.5X SSC to 2X SSC, 0.1%
SDS, at 55-65°C, and which encodes a polypeptide
having enzymatic activity similar to that of
Arabidopsis thaliana branched chain 2-oxoacid
dehydrogenase complex E1β subunit;

(c) a nucleotide sequence encoding the same
genetic information as said nucleotide sequence of
(a), but which is degenerate in accordance with the
degeneracy of the genetic code; and

(d) a nucleotide sequence encoding the same
genetic information as said nucleotide sequence of
(b), but which is degenerate in accordance with the
degeneracy of the genetic code.

18. A recombinant vector, comprising said
isolated DNA molecule of claim 17.

19. A host cell transformed with said
recombinant vector of claim 18.

20. An isolated polypeptide having the amino 
acid sequence of SEQ ID NO.: 14. 

21. The isolated DNA molecule of claim 17,
wherein the naturally occurring branched chain
oxoacid dehydrogenase complex E2 component binding
region thereof is replaced with the E2 component
binding region of a plastid pyruvate dehydrogenase
complex E1β subunit.

22. The isolated DNA molecule of claim 21,
wherein said plastid pyruvate dehydrogenase complex
E1β subunit has the sequence shown in SEQ ID NO.: 3.

23. A recombinant vector, comprising said
isolated DNA molecule of claim 22. 

24. A host cell transformed with said 
recombinant vector of claim 23. 

25. An isolated DNA molecule, comprising a
nucleotide sequence selected from the group
consisting of:

(a) the nucleotide sequence shown in SEQ ID
NO: 15, or the complement thereof;

(b) a nucleotide sequence that hybridizes to
said nucleotide sequence of (a) under a wash
stringency equivalent to 0.5X SSC to 2X SSC, 0.1%
SDS, at 55-65°C, and which encodes a polypeptide
having enzymatic activity similar to that of
Arabidopsis thaliana branched chain 2-oxoacid
dehydrogenase complex E2 component;

(c) a nucleotide sequence encoding the same
genetic information as said nucleotide sequence of
(a), but which is degenerate in accordance with the
degeneracy of the genetic code; and

(d) a nucleotide sequence encoding the same
genetic information as said nucleotide sequence of
(b), but which is degenerate in accordance with the
degeneracy of the genetic code.

26. A recombinant vector, comprising said 
isolated DNA molecule of claim 25. 

27. A host cell transformed with said 
recombinant vector of claim 26. 

28. An isolated polypeptide having the amino 
acid sequence of SEQ ID NO.: 16. 

29. A plant, a plastid of which comprises the
following polypeptides:

an enzyme that enhances the biosynthesis of
2-oxobutyrate;

a branched chain oxoacid dehydrogenase complex
E1α subunit;

a branched chain oxoacid dehydrogenase complex
E1β subunit; and

a branched chain oxoacid dehydrogenase complex
E2 component.

30. The plant of claim 29, wherein said
branched chain oxoacid dehydrogenase complex E1α
subunit has the sequence shown in SEQ ID NO.: 12, said
branched chain oxoacid dehydrogenase complex E1β
subunit has the sequence shown in SEQ ID NO.: 14, or
said branched chain oxoacid dehydrogenase complex E2
component has the sequence shown in SEQ ID NO.: 16.

31. The plant of claim 29, wherein said plastid
further comprises the following polypeptides:

a β-ketothiolase;

a β-ketoacyl-CoA reductase; and

a polyhydroxyalkanoate synthase.

32. The plant of claim 31, the genome of which 
comprises introduced DNAs encoding said polypeptides, 
wherein each of said introduced DNAs is operatively 
linked to a targeting peptide coding region capable 
of directing transport of said polypeptide encoded 
thereby into a plastid. 

33. A method of producing P(3HB-co-3HV)
copolymer, comprising growing said plant of claim 32,
and recovering P(3HB-co-3HV) copolymer produced
thereby.

34. A plant, a plastid of which comprises the
following polypeptides:

an enzyme that enhances the biosynthesis of
2-oxobutyrate;

a branched chain oxoacid dehydrogenase complex
E1α subunit;

a branched chain oxoacid dehydrogenase complex
E1β subunit;

a branched chain oxoacid dehydrogenase complex
E2 component; and

a dihydrolipoamide dehydrogenase E3 component.

35. The plant of claim 34, wherein said
branched chain oxoacid dehydrogenase complex E1α
subunit has the sequence shown in SEQ ID NO.: 12, said
branched chain oxoacid dehydrogenase complex E1β
subunit has the sequence shown in SEQ ID NO.: 14, or
said branched chain oxoacid dehydrogenase complex E2
component has the sequence shown in SEQ ID NO.: 16.

36. The plant of claim 34, wherein said plastid
further comprises the following polypeptides:

a β-ketothiolase;

a β-ketoacyl-CoA reductase; and

a polyhydroxyalkanoate synthase.

37. The plant of claim 36, the genome of which 
comprises introduced DNAs encoding said polypeptides, 
wherein each of said introduced DNAs is operatively 
linked to a targeting peptide coding region capable 
of directing transport of said polypeptide encoded 
thereby into a plastid. 

38. A method of producing P(3HB-co-3HV)
copolymer, comprising growing said plant of claim 37,
and recovering P(3HB-co-3HV) copolymer produced
thereby.

39. A plant, a plastid of which comprises the
following polypeptides:

an enzyme that enhances the biosynthesis of
2-oxobutyrate;

a branched chain oxoacid dehydrogenase complex
E1α subunit; and

a branched chain oxoacid dehydrogenase complex
E1β subunit, the naturally occurring E2 binding
region of which is replaced with the E2 binding
region of a plastid pyruvate dehydrogenase complex
E1β subunit.

40. The plant of claim 39, wherein said
branched chain oxoacid dehydrogenase complex E1α
subunit has the sequence shown in SEQ ID NO.: 12.

41. The plant of claim 39, wherein said plastid
further comprises the following polypeptides:

a β-ketothiolase;

a β-ketoacyl-CoA reductase; and

a polyhydroxyalkanoate synthase.



42. The plant of claim 41, the genome of which
comprises introduced DNAs encoding said polypeptides,
wherein each of said introduced DNAs is operatively
linked to a targeting peptide coding region capable
of directing transport of said polypeptide encoded
thereby into a plastid.

43. A method of producing P(3HB-co-3HV)
copolymer, comprising growing said plant of claim 42,
and recovering P(3HB-co-3HV) copolymer produced
thereby.
ABSTRACT OF THE DISCLOSURE

Provided are nucleic acid coding sequences and
methods utilizing these sequences for optimizing
levels of substrates employed in the biosynthesis of
copolymers of 3-hydroxybutyrate (3HB) and
3-hydroxyvalerate (3HV) in plants via manipulation of normal
metabolic pathways using recombinant techniques.
This optimization is achieved through the use of a
variety of wild-type and/or deregulated enzymes
involved in the biosynthesis of aspartate family
amino acids, and wild-type or deregulated forms of
enzymes, such as threonine deaminase, involved in the
conversion of threonine to P(3HB-co-3HV) copolymer
endproduct. These enzymes are used in conjunction
with the E1α, E1β, E2, and E3 subunits of plastid
pyruvate dehydrogenase complexes and branched chain
oxoacid dehydrogenase complexes or mitochondrial
dihydrolipoamide dehydrogenase E3 components to
enhance the levels of threonine, 2-oxobutyrate
(α-ketobutyrate), propionate, propionyl-CoA,
β-ketovaleryl-CoA, and β-hydroxyvaleryl-CoA. Also
provided are methods for the biological production of
P(3HB-co-3HV) copolymer in plants utilizing the
enhanced levels of propionyl-CoA produced therein.
Introduction into plants of an appropriate
β-ketothiolase, a β-ketoacyl-CoA reductase, and a PHA
synthase in combinations with the aforementioned
enzymes will permit such plants to produce
commercially useful amounts of P(3HB-co-3HV)
copolymers.
CONSTRUCT 1: ATTACH THE CHLOROPLAST TARGETING PEPTIDE OF E1α
TO THE BRANCHED-CHAIN E1α. THIS CREATES A PLASTID TARGETED
BRANCHED-CHAIN E1α CHIMERA.
(Panel labels: BRANCHED-CHAIN E1α; PLASTID E1α; PLASTID TARGETED
BRANCHED-CHAIN E1α CHIMERA)

CONSTRUCT 2: REPLACE THE N-TERMINUS OF THE BRANCHED-CHAIN E1β
(INCLUDING THE E2 BINDING DOMAIN) WITH THE N-TERMINUS OF THE
PLASTID E1β (INCLUDING THE CHLOROPLAST TARGETING PEPTIDE AND THE
PLASTID E2 BINDING DOMAIN). THIS CREATES A PLASTID TARGETED
BRANCHED-CHAIN E1β CHIMERA.
(Panel labels: PLASTID E1β; PLASTID TARGETED BRANCHED-CHAIN E1β
CHIMERA)

CONSTRUCT 3: ATTACH THE CHLOROPLAST TARGETING PEPTIDE OF THE
PLASTID E2 TO THE MATURE PORTION OF THE BRANCHED-CHAIN E2, TO
CREATE A PLASTID TARGETED BRANCHED-CHAIN E2 CHIMERA.
(Panel label: PLASTID E2)

CONSTRUCT 4: MEGA PLASMID CODING FOR BOTH CHIMERIC (PLASTID
TARGETED BRANCHED-CHAIN) SUBUNITS OF THE PDH. ATTACH THE E1α
CHIMERIC SEQUENCE TO THE E1β CHIMERIC SEQUENCE WITH TRANSCRIPTION
TERMINATOR AND PROMOTER SEQUENCES BETWEEN THE TWO.

CONSTRUCT 5: ATTACH THE CHLOROPLAST TARGETING PEPTIDE OF THE
PLASTID E1β TO THE MATURE PORTION OF THE BRANCHED-CHAIN E1β.
THIS CREATES A PLASTID TARGETED BRANCHED-CHAIN E1β CHIMERA.
(Panel labels: PLASTID E1β; PLASTID TARGETED BRANCHED-CHAIN E1β
CHIMERA)



FIG. 8A 



PlastidA.t. MATAFAPTKLTATVPLHGSHENRLLLPIRLAPPSSFLGSTRSLSLRRLNH 50 

P . purpurea 

A.thaliana MALSRLSSRSNIITRPFSAAFSRLIS 26 

H. sapiens II MRKMLAAVSRVLSGASQKPASRVLVAS 27 

S.cerevisiae MLAASFKRQPSQLVRGLGAVLRTPTRIGHVRTMATLKTTDKKAPEDI 4 7 

A. suum I MIFVFANIFKVPTVSPSVMAISV 23 

M.capricolum MTYL 4 

B. subtilis MGVKTFQFPFAEQL 14 

Consensus 50 



Motif 1 

SNATRRSPWSVQEWKEKQSTNNTSLLITKEEGLELYEDMILGRSFEDM 1< 

MSYPKKVELPLTNCNQINLTKHKLLVLYEDMLLGRNFEDM 

TDTTPITIETSLPFTAHLCDPPSRSVESSSQELLD-FFRTMALMRRMEIA 
RNFANDATFEIKKCDLHRLEEGPPVTTVLTREDGLKYYRMMQTVRRMELK 
EGSDTVQIELPESSFESYMLEPPDLSYETSKATLLQMYKDMVIIRRMEMA 
RLASTEATFQTKPFKLHKLDSGPDINVHVTKEDAVHYYTQMLTIRRMESA 
GKFDPLKNEKVCVLDKDGKVINPKLMPKISDQEILEAYKIMNLSRRQDIY 
EKVAEQFPTFQILNEEGEWNEEAMPELSDEQLKE-LMRRMVYTRILDQR 

L . . Y . . M. . .RR.E. . H 



o 

CAQMYYRGKMFGFVHLYNGQEAVSTGFIKLLTKSDSWSTYRDHVHALSK 1 
CAQMYYKGKMFGFVHLYNGQEAVSTGVIKLLDSKDYVCSTYRDHVHALSK 

ADSLYKANVIRGFCHLYDGQEAVAIGMEAAITKKDAI ITAYRDHCIFLGR i: 

ADQLYKQKI IRGFCHLCDGQEACCVGLEAGINPTDHLITAYRAHGFTFTR 1 : 

CDALYKAKKIRGFCHIiSVGQEAIAVGIENAITKLDSIITSYRCHGFTFMR 1- 

AGNLYKEKKVRGFCHLYSGQEACAVGTKAAMDAGDAAVTAYRCHGWTYLS 1 : 

QNTMQRQGRLLSFLSSTGQEACEVAYINALNKKTDHFVSGYRNNAAWLAM 1< 

S I SLNRQGRL - GFYAPTAGQEASQ I ASHFALEKEDF I LPGYRDVPQ 1 1 WH 1 : 

. . . LY GF . HL . . GQEA ...G K.D YR.H 1! 




FIG. 8B 

TPP -binding site . 

GVSARAVMSELFGKVTGCCRGQGGSMHMFSKEHNMLGGFAFIGEGI PVAT 2 00 

GVPSQNVMAELFGKETGCSRGRGGSMHIFSAPHNFLGGFAFIAEGI PVAT 14 0 

GGSLHEVFSELMGRQAGCSKGKGGSMHFYKKESSFYGGHGIVGAQVPLGC 17 5 

GLSVREILAELTGRKGGCAKGKGGSMHMYAKN- -FYGGNGIVGAQVPLGA 175 

GASVK7WLAELMGRRAGVSYGKGGSMHLYAPG - - FYGGNGIVGAQVPLGA 19 5 

GSSVAKVLCELTGRITGNVYGKGGSMHMYGEN- -FYGGNGIVGAQQPLGT 171 

GQL VRN I ML YW I GNE AG - GKAP EG - VNCL P PN IVIGSQYSQAT 14 5 

GLPLYQAFLFSRGHFHG-NQIPEG-VNVLPPQ III GAQYI QAA 153 

G.S...V..EL.G...G.. .G.GGSMH --F.GG. .I.GAQ.P. . . 200 

PDH 6 binding site 

GAAFSSKYRREVLKQDCD-DVTVAFFGDGTCNNGQFFECLNMAALYKLPI 249 

GAAFQSIYRQQVLKEPGELRVTACFFGDGTTNNGQFFECLNMAVLWKLPI 190 

GIAFAQKYNKE EA VTFALYGDGAANQGQLFEALNI SALWDLPA 218 

GIALACKYNGK- - -DE VCLTLYGDGAANQGQI FEAYNMAALWKLPC 218 

GLAFAHQYKNE- - -DA CS FTLYGDGASNQGQVFE S FNMAKLWNL P V 23 8 

G I AFAMKYRKE KN VCI TMFGDGATNQGQLFE SMNMAKLWDL P V 214 

GIAFADKYRKT GG WVTTTGDGGSSEGETYEAMNFAKLHEVPC 18 8 

GVALGLKMRGK- - -KA VAITYTGDGGTSQGDFYEGINFAGAFKAPA 196 

G.AFA.KYR. . . V. .T. . GDG . . NQGQ . FE - .NMA.LW.LP. 250 

*3 

IFWENNLWAIGMSHLRATSDPEIWKKGPAFGMPGVHVDGMDVLKVREVA 299 

I FWENNQWAI GMAHHRS SSI PE IHKKAEAFGLPGI EVDGMDVLAVRQVA 240 

I LVCENNH YGMGTAEWRAAKS PS YYKRGD - Y - VPGLKVDGMDAFAVKQAC 26 6 

I F ICENNRYGMGTS VERAAASTD YYKRGD - F - I PGLRVDGMD I LCVREAT 26 6 

VFCCENNKYGMGTAAS RS S AMTE YFKRGQ - Y - 1 PGLKVNGMDI LAVYQAS 2 8 6 

LYVCENNGYGMGTAAARS SASTDYYTRGD - Y - VPG I WVDGMD VLAVRQAV 2 62 

I FVIENNKWAISTARSEQTKS INFAVKGIATGI PSI IVDGNDYLACIGVF 23 8 

I FWQNNRFAISTPVEKQTVAKTLAQKAVAAGI PGIQVDGMDPLAVYAAV 24 6 

IFV.ENN. . . . GTA. . R K.G PG . . VDGMD . LAV . .A. 300 

*1 . 2 

KEAVTRARRGEGPTLVECETYRFRGHSLADPD- ELRDAAE - KAKYAARDP 34 7 

EKAVERARQGQGPTLIEALTYRFRGHSLADPD-ELRSRQE - KEAWVARDP 28 8 

KFAKQHALE - KGP 1 1 LEMDTYRYHGHSMSDPGS TYRTRDE I SGVRQERD P 315 

RFAAAYCRSGKGPILMELQTYRYHGHSMSDPGVSYRTREEIQEVRSKSDP 316 

KFAKDWCLSGKGPLVLEYETYRYGGHSMS DPGTT YRTRDE I QHMRS KND P 33 6 

RWAKEWCNAGKGPLMIEMATYRYSGHSMSDPGTSYRTREEVQEVRKTRDP 312 

KEWEYVRKGNGPVLVECDTYRLGAHSSSDNPDAYRPKGEFEEM-AKFDP 287 

KAARERAI NGEGPTLI ETLCFRYGPHTMS GDDPTRYRS KELENEWAKKD P 2 96 

K.A G.GP.L.E. . TYRY . GHSMSDP . . .YR.R.E DP 350 






FIG. 8C 



IAALKKYLIENKLAKEAELKSIEKKIDELVEEAVEFADASPQPG- -RSQL 3 95 

IKKLKKHILDNQIASSDELNDIQSSVKIDLEQSVEFAMSSPEPN--ISEL 3 36 

IERI KKLVLSHDLATEKELKDMEKEIRKEVDDAIAKAKDCPMPE - - PSEL 3 63 

IMLLKDRMVNSNLASVEELKEIDVEVRKEIEDAAQFATADPEPP--LEEL 3 64 

IAGLKMHLIDLGIATEAEVKAYDKSARKYVDEQVELADAAPPPEAKLS IL 3 8 6 

ITGFKDKIVTAGLVTEDEIKEIDKQVRKEIDAAVKQAHTDKESPVELMLT 3 62 

LIRLKQYLIDKKIWSDEQQAQLEAEQDKFVADEFAWVEKNKNYDL-IDIF 33 6 

LVRFRKFLEAKGL WSEEEENNV I EQAKEE I KEAI KKADETP KQK - - VTDL 3 44 

I..LK LA.E.E.K K A...A...P.P.--...L 400 

LENVFAD PKGFGIG PDGR YRCED PKFTEG - TAQV 4 2 8 

K RY LFADN 344 

FTNVYV - - KGFG TES FGPDRKEVKAS -LP- - 389 

GYHIYSSDPPF EVRGANQWIKFKSVS 3 90 

FEDVYVKGTETPTLRGRI PEDTWDFKKQGFASRD 4 2 0 

DI YYNTPAQYVRCTTDEVLQKYLTSEEAVKALAK 3 96 

KYQYDKMDIFLEEQYKEAKEFFEKYPESKEGGHH 3 70 

ISIMFE-ELPF NLKEQYEIYKEKESK-- 3 69 



434 






FIG. 9A 



Plastid A.t. MSSIIHGAGAATTTLSTFNSVDSKKLFVAPSRTNLSVRSQRYIVAGSDAS 50 

P .purpurea * 

A.thaliana 

H. sapiens 

S.cerevisiae MFS 3 

A . suum 

M.capricolum 

B. subtilis 

Consensus 50 



KKSFGSGLRVRHSQKLIPNAVATKEADTSASTGHELLLFEALQEGLEEEM 10 0 

MSKVFMFDALRAATDEEM 18 

MLG I LRQRAI DGASTLRRTRFALVS ARS YAAGAKEMTVRDALNSAI DEEM 50 

MAAVSGLVRRPLREVSGLLKRRFHWTAPAALQVTVRDAINQGMDEEL 4 7 

RLPT S LARNVARRAPTS FVRPS AAAAALRFS S TKTMTVREALNS AMAEEL 5 3 

- -MAVNGCMRLLRNGLTSACALEQS VRRLASGTLNVTVRDALNAALDEE I 4 8 

MAIINNIKAVTDALDCAM 18 

MAQMTMVQAI TDALR I EL. 18 

-- T. . . AL . .A. DEE. 100 



Region 1

DRDPHVCVMGEDVGHYGGSYKVTKGLADKFGDLRVLDTPICENAFTGMGI 150 

EKDLTVCVI GEDVGHYGGSYKVTKDLHSKYGDLRVL.DTP IAENS FTGMAI 6 8 

S ADPKVFVMGEEVGQYQGAYKI TKGLLEKYG PERVYDT P I TEAGFTG I GV 10 0 

ERDEKVFLLGEEVAQYDGAYKVSRGLWKKYGDKRIIDTPISEMGFAGIAV 97 

DRDDDVFLIGEEVAQYNGAYKVSKGLLDRFGERRWDTPITEYGFTGLAV 103 

KRDDRVFL I GEEVAQYDGAYKI S KGLWKKYGDGRI WDT P I TEMAI AGLS V 98 

QRDPNVI VFGEDVGTEGGVFRATQGLAVKFGNDRCFNAPI SEAMFAGVGL 6 8 

KNDPNVLI FGEDVGVNGGVFRATEGLQAEFGEDRVFDTPLAESGIGGLAI 6 8 

. RD . . V . . . GE . VG . Y . G . YK . TKGL . . K . G . . RV . DTPI .E..F.G... 150 



GAAMTGLRPVI EGMNMGFLLLAFNQ I SNNCGMLHYTSGGQFT I PWIRGP 2 00 

GAA I TGLR P I VEGMNMS FLLLAFNQ I SNNAGMLR YT S GGNFTL P L V I RG P 118 

GAAYAGLKPWEFMTFNFSMQAIDHI INSAAKSNYMSAGQINVPIVFRGP 15 0 

GAAMAGLRPICEFMTFNFSMQAIDQVINSAAKTYYMSGGLQPVPIVFRGP 14 7 

GAALKGLKPIVEFMSFNFSMQAIDHWNSAAKTHYMSGGTQKCQMVFRGP 153 



FIG. 9B 

GAAMNGLR P ICEFMSMNFSMQGIDHII NSAAKAHYM S AGRFH VP I VFRGA 14 8 

GMAMNGMKPVLEMQFEGLGLASLQNIFTNISRMRNRTRGKYTAPMVIRMP 118 

GLALQGFRPVPE I QFFGFVYEVMDS I CGQMAR I RYRTGGR YHMP ITIRSP 118 

GAA. . GLRP . .E.M. . . F. . .A.D.I.N.AA. . . Y . SGG . . . .P.V.RGP 200 



Region 2 

GGVGRQLGAEHSQRLESYFQSIPGIQMVACSTPYNAKGLMKAAIRSENPV 250 

GGVGRQLGAEHSQRLEAYFQAI PGLKI VACSTPYNAKGLLKSAIRDNNPV 168 

NGAAAGVGAQHSQCYAAWYASVPGLKVLAPYSAEDARGLLKAAIRDPDPV 2 0 0 

NGASAGVAAQHSQCFAAWYGHCPGLKWSPWNSEDAKGLIKSAIRDNNPV 197 

NGAAVGLGAQHSQDFSPWYGSI PGLKVLVPYSAEDARGLLKAAIRDPNPV 2 03 

NGAAVGVAQQHSQDFTAWFMHCPGVKWVPYDCEDARGLLKAAVRDDNPV 198 

MGGGIRALEHHSEALEAVYAHIPGVQIVCPSTPYDTKGLILAAIDSPDPV 168 

FGGGVHTPELHSDSLEGLVAQQPGLKWIPSTPYDAKGLLISAIRDNDPV 168 

•G A.HSQ. . . A PGLKW. P. . . . DAKGLLKAAIRD . NPV 250 



ILFEHVLLYN LKE K I PDED Y I CNLE E AEMVRPGEH I T I LTYS RMR Y 2 96 

VFFEHVLLYN LQEEIPEDEYLIPLDKAEWRKGKDITILTYSRMRH 214 

VFLENELLYGESFPISEEALDSSFCLPIGPCAKIEREGKDVTIVTFSKMVG 2 50 

WLENELMYGVP FEFLPEAQS KDFLI P I GKAKI ERQGTH I TWSHS RP VG 247 

VFLENELLYGES FE I S EEALS PEFTLPY- KAKI EREGTDI S I VTYTRNVQ 2 52 

I CLENE I LYGMKF PVS PEAQS PDFVLPFGQAKIQRPGKD I TIVSLSI GVD 24 8 

I WEPTKLYR AFKQEVPDEH Y I VP I GEG YKI QEGNDLT WTYGAQTV 215 

I FLEHL.KLYR - - - S FRQE VPEGE YT I P I GKADI KREGKD I T 1 1 AYGAMVH 215 

. .LE. . LLY E P.GKA.I.R.G.DITIVTYS. .V. 300 



Region 3 

HVMQAAKTL VNK - - G YD P E V I D I R S LKP FDLHT I GNS VKKTHRVL I VEE C 344 

H.VTEALP LLLND - - G YD PE VLDL I S LKP LD I D S I S VS VKKTHRVL I VEEC 2 62 

FALKAAEKLAEE - - G I SAEVINLRS IRPLDRATTNASVRKTSRLVTVEEG 2 98 

HCLEAAAVLSKE - -GVECEVINMRTIRPMDMETIEASVMKTNHLVTVEGG 2 95 

FSLEAAE I LQKKY - GVSAE VINL.RS I RPLDTEAI I KTVKKTNHL I TVEST 3 01 

VS LHAADELAKS - - G I DCEV INLRC VRP LD FQTVKDS VI KTKHLVT VE SG 2 96 

DCQKAIALLKETHPNATIDLIDLRSIKPWDKKMVIESVKKTGRLLWHEA 2 65 

ESLKAAAELEKE- -GISAEWDLRTVQPLDIETI IGSVEKTGRAI WQEA 2 63 

. . L . AA . . L . . . - -G . . .EVI.LRS. . PLD. . TI . . SV. KT . RL . .VEE. 3 50 



Region 4 

MRTGGI GAS LTAAINE - NFHD YLDAP VMCLS SQD VPTP YAGTLEEWT WQ 3 93 



FIG. 9C 

MKTAGIGAELIAQINE-HLFDELDAPWRLSSQDIPTPYNGSLEQATVIQ 311 

FPQHGVCAE I CAS WE - ES FS YLDAPVERI AGADVP I P YTANLERLALPQ 34 7 

WPQFGVGAEICARIMEGPAFNFLDAPAVRVTGADVPMPYAKILEDNSIPQ 34 5 

FPSFGVGAEI VAQVMESEAFDYLDAPIQRVTGADVPTPYAKELEDFAFPD 3 51 

WPNCGVGAEISARVTESDAFGYLDGPILRVTGVDVPMPYAQPLETAALPQ 34 6 

VKSFSVSAEI IATVNE-ECFEYIKAPLSRCTGYDVITPFDRG-EGYFQVN 313 

QRQAGIAANWAEINE - RAILSLEAPVLRVAAPDTVYPFAQA- ESVWLPN 311 

. . . .GVGAEI.A. . .E-. .F.YLDAP. .R. .G.DVP.PYA. . LE . . . . PQ 40 0 

PAQIVTAVEQLCQ 4 06 

PHQI IDAVKNIVNSSKTITT 331 

I ED I VRAS KRACYRS K 3 63 

VKD 1 1 FAI KKTLNI 3 59 

TPTIVKAVKEVLSIE 3 66 

P ADWKMVKKCLNVQ 361 

PKKVLVKMQELLDFKF 32 9 

FKDVI ETAKKVMNF 325 

. . . I . . A . K 420 



FIG. 10A



A. t. MAA LLG-RSC RKLSFPSLTHG ARR- 23 

Human MAWAAAAGWLLRLRAAGAEGHWRRLPGAGLARGFLHPAATVEDAAQRRQ 50 

Bovine MAAVAAFAGWLLRLRAAGADGPWRRLCGAGLSRGFLQSASAY-GAAQRRQ 49 

Consensus MAAVAA . AGWLLRLRAAGA . G . WRRL . GAGL . RGFL . . A . . . - . AAQRRQ 50



V STETGKP- -LNLYSAINQALHIALDTDPRSYVFGEDVGF 61 

VAHFTFQPDPEPREYGQTQKMNLFQSVTSALDNSLAKDPTAVI FGEDVAF 10 0 

VAHFTFQPDPEPVEYGQTQKMNLFQAVTSALDNSLAKDPTAVI FGEDVAF 9 9 

VAHFTFQPDPEP . EYGQTQKMNLFQAVTSALDNSLAKDPTAVI FGEDVAF 10 0 



GGVFRCTTGLAERFGKNRVFNTPLCEQGI VGFGIGLAAMGNRAIVEIQFA 111 

GGVFRCTVGLRDKYGKDRVFNTPLCEQGIVGFGIGIAVTGATAIAEIQFA 150 

GGVFRCTVGLRDKYGKDRVFNTPLCEQGIVGFGIGIAVTGATAIAEIQFA 14 9 

GGVFRCTVGLRDKYGKDRVFNTPLCEQGI VGFGI G I AVTGATAI AE I QFA 150 



D Y I YPAFDQ I VNE AAKFRYRS GNQFNCGGLT I RAP YGAVGHGGHYHS QS P 161 

D Y I FPAFDQ I VNEAAKYRYRS GDLFNCGS LT I RS PWGCVGHGALYHS Q S P 2 00 

DYIFPAFDQIVNEAAKYRYRSGDLFNCGSLTIRSPWGCVGHGALYHSQSP 19 9 

D Y I FPAFDQ I VNEAAKYRYRS GDLFNCGSLTI RS PWGCVGHGALYHS QS P 2 00 



EAFFCHVPGI KWI PRS PREAKGLLLS C IRDPNP WFFEPKWLYRQAVEE 211 

EAFFAHCPGI KWI PRS PFQAKGLLLSC I EDKNPC I FFEPKI LYRAAAEE 2 50 

EAFFAHCPGIKVWPRSPFQAKGLLLSCIEDKNPCIFFEPKILYRAAVEQ 24 9 

EAFFAHCPGI KWI PRS P FQAKGLLLS C I EDKNP CIFFEPKI LYRAAVEE 250 



FIG. 10B 

VPEHDYMI PLS EAE VI REGND I TLVGWGAQLTVMEQ - ACLDAE KEGI S CE 2 6 0 

VPIEPYNI PLSQAEVIQEGSDVTLVAWGTQVHVIREVASMAKEKLGVSCE 3 0 0 

VPVEPYNI PLSQAEVIQEGSDVTLVAWGTQVHEIREVAAMAQEKLGVSCE 2 99 

VP . EPYNI PLSQAEVIQEGSDVTLVAWGTQVHVIREVA . MA . EKLGVSCE 3 0 0 

LIDLKTLLPWDKETVEASVKKTGRLLISHEAPVTGGFGAEISATILERCF 310 

VIDLRTIIPWDVDTICKSVIKSGRLLISHEAPLTGGFASEISSTVQEECF 35 0 

VI DLRT I LPWDVDTVCKSVI KTGRLLVSHEAPLTGGFASE I SSTVQEQCF 34 9 

V I DLRT I L P WD VDTVCKS V I KTGRLL I S HEAP LTGGFAS E I S S T VQE . CF 350 

LKLEAPVSRVCGLDTPFPLVFEPFYMPTKNKILDAIKSTVNY 352 

LNLEAP I SRVCGYDTPFPHI FEPFYI PDKWKCYDALRKMINY 3 92 

LNLEAPISRVCGYDTPFPHIFEPFYIPDKWKCYDALRKMINY 3 91 

LNLEAPI SRVCGYDTPFPHI FEPFYI PDKWKCYDALRKMINY 3 92 



APPENDIX A 



CATCTCTTGT TCTCTCCGCC CATCTCTGCT CTCTTTTATT TTCCCAGAAA GTTTTTTTTT 60

TTTTTTCCGA ATTCCGTTAA TCTCATTGGG GTTTCCATTG ATAGCAATGG CGACGGCTTT 120

CGCTCCCACT AAGCTCACTG CCACGGTTCC TCTGCATGGA TCCCATGAGA ATCGTCTCTT 180

GCTCCCGATC CGATTGGCTC CTCCTTCTTC TTTCCTCGGA TCCACCCGTT CCCTCTCCCT 240

TCGCAGACTC AATCACTCCA ACGCCACCCG TCGATCTCCC GTCGTCTCTG TCCAGGAAGT 300

TGTCAAGGAG AAGCAATCCA CCAATAATAC CAGCCTGTTG ATAACCAAAG AGGAAGGATT 360

GGAGTTGTAT GAAGATATGA TACTAGGTAG ATCTTTCGAA GACATGTGTG CTCAAATGTA 420

TTACCGAGGC AAGATGTTTG GTTTTGTTCA CTTGTACAAT GGCCAAGAGG CTGTTTCTAC 480

TGGCTTTATC AAGCTCCTTA CGAAGTCTGA CTCTGTCGTT AGTACCTACC GTGACCATGT 540

CCATGCCCTC AGCAAAGGTG TCTCTGCTCG TGCTGTTATG AGCGAGCTCT TCGGCAAGGT 600

TACTGGATGC TGCAGAGGCC AAGGTGGATC CATGCACATG TTCTCGAAAG AACACAACAT 660

GCTTGGTGGC TTTGCTTTTA TTGGTGAAGG CATTCCTGTC GCCACTGGTG CTGCCTTTAG 720

CTCCAAGTAC AGGAGGGAAG TCTTGAAACA GGATTGTGAT GATGTCACTG TCGCCTTTTT 780

CGGAGATGGA ACTTGTAACA ACGGACAGTT CTTCGAGTGT CTCAACATGG CTGCTCTCTA 840

TAAACTGCCT ATTATCTTTG TTGTCGAGAA TAACTTGTGG GCCATTGGGA TGTCTCACTT 900

GAGAGCCACT TCTGACCCCG AGATTTGGAA GAAAGGTCCT GCATTTGGGA TGCCTGGTGT 960

TCATGTTGAC GGTATGGATG TCTTGAAGGT CAGGGAAGTC GCTAAAGAAG CTGTCACTAG 1020

CTCCTTGGCT GATCCCGATG AGCTCCGTGA TGCTGCTGAG AAAGCTAAAT ACGCGGCTAG 1140

AGACCCAATC GCAGCATTGA AGAAGTATTT GATAGAGAAC AAGCTTGCAA AGGAAGCAGA 1200

GCTAAAGTCA ATAGAGAAAA AGATAGACGA GTTGGTGGAG GAAGCGGTTG AGTTTGCAGA 1260

CGCTAGTCCA CAGCCCGGTC GCAGTCAGTT GCTAGAGAAT GTGTTTGCTG ATCCAAAAGG 1320

ATTTGGAATT GGACCTGATG GACGGTACAG ATGTGAGGAC CCCAAGTTTA CCGAAGGCAC 1380

AGCTCAAGTC TGAGAAGACA AGTTTAACCA TAAGCTGTTT ACTGTCTCTT CGATGTTTCT 1440

TAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 13 30 



APPENDIX B 



MATAFAPTKL TATVPLHGSH ENRLLLPIRL APPSSFLGST RSLSLRRLNH SNATRRSPVV 60 

SVQEVVKEKQ STNNTSLLIT KEEGLELYED MILGRSFEDM CAQMYYRGKM FGFVHLYNGQ 120

EAVSTGFIKL LTKSDSVVST YRDHVHALSK GVSARAVMSE LFGKVTGCCR GQGGSMHMFS 180

KEHNMLGGFA FIGEGIPVAT GAAFSSKYRR EVLKQDCDDV TVAFFGDGTC NNGQFFECLN 240

MAALYKLPII FVVENNLWAI GMSHLRATSD PEIWKKGPAF GMPGVHVDGM DVLKVREVAK 300 

EAVTRARRGE GPTLVECETY RFRGHSLADP DELRDAAEKA KYAARDPIAA LKKYLIENKL 360 

AKEAELKSIE KKIDELVEEA VEFADASPQP GRSQLLENVF ADPKGFGIGP DGRYRCEDPK 420 

FTEGTAQV 428



APPENDIX C 



GAAAAAATGT CTTCGATAAT CCATGGAGCT GGAGCTGCTA CGACGACGTT ATCGACGTTT 60

AATTCCGTCG ATTCCAAGAA ACTCTTCGTT GCTCCTTCTC GCACAAATCT TTCAGTGAGG 120

AGCCAGAGAT ATATAGTGGC TGGATCTGAT GCGAGTAAGA AGAGCTTTGG TTCTGGACTT 180

AGAGTTCGTC ACTCTCAGAA ATTGATTCCA AATGCTGTTG CGACGAAGGA GGCGGATACG 240

TCTGCGAGCA CTGGACATGA ACTATTGCTT TTCGAGGCTC TTCAGGAAGG TCTGGAAGAA 300

GAGATGGACA GAGATCCACA TGTATGTGTT ATGGGTGAAG ATGTTGGCCA TTACGGAGGT 360

TCCTACAAGG TAACCAAAGG CCTTGCTGAT AAATTTGGTG ACCTCAGGGT TCTCGACACT 420

CCTATTTGTG AAAATGCATT CACCGGTATG GGCATTGGAG CTGCCATGAC TGGTCTAAGA 480

CCCGTTATTG AAGGTATGAA CATGGGTTTC CTCCTCCTCG CCTTCAACCA AATCTCCAAC 540

AACTGTGGAA TGCTTCACTA CACATCCGGT GGTCAGTTTA CGATCCCGGT TGTCATCCGT 600

GGACCTGGTG GAGTGGGACG CCAGCTTGGT GCTGAGCATT CACAGAGGTT AGAATCTTAC 660

TTTCAGTCCA TCCCTGGGAT CCAGATGGTT GCTTGCTCAA CTCCTTACAA CGCCAAAGGG 720

TTGATGAAAG CCGCAATAAG AAGCGAGAAC CCTGTGATTC TGTTCGAACA CGTGCTGCTT 780

TACAATCTCA AGGAGAAAAT CCCGGATGAA GATTACATCT GTAACCTTGA AGAAGCTGAG 840

ATGGTCAGAC CTGGCGAGCA CATTACCATC CTCACTTACT CGCGAATGAG GTACCATGTG 900

ATGCAGGCAG CAAAAACTCT GGTGAACAAA GGGTATGACC CCGAGGTTAT CGACATCAGG 960

TCACTGAAAC CGTTCGACCT TCACACAATT GGAAACTCGG TGAAGAAAAC ACATCGGGTT 1020

TTGATCGTGG AGGAGTGTAT GAGAACCGGT GGGATTGGGG CAAGTCTTAC AGCTGCCATC 1080

AACGAGAACT TTCATGACTA CTTAGATGCT CCGGTGATGT GTTTATCTTC TCAAGACGTT 1140

CCTACACCTT ACGCTGGTAC ACTGGAGGAG TGGACCGTGG TTCAACCGGC TCAGATCGTG 1200

ACCGCTGTCG AGCAGCTTTG CCAGTAAATT CATATTTATC CGATGAACCA TTATTTATCA 1260

TTTACCTCTC CATTTCCTTT CTCTGTAGCT TAGTTCTTAA AGAATTTGTC TAAGATGGTT 1320

TGTTTTTGTT AAAGTTTGTC TCCTTTGTTG TGTCTTTTAA TATGGTTTGT AACTCAGAAT 1380

GTTTGTTTGT TAATTTTATC TCCCACTTTC TTTTAAAAAA AAAAAAAAAA AAAAAAAAAA 1440

A 1441



APPENDIX D 



MSSIIHGAGA ATTTLSTFNS VDSKKLFVAP SRTNLSVRSQ RYIVAGSDAS KKSFGSGLRV 60

RHSQKLIPNA VATKEADTSA STGHELLLFE ALQEGLEEEM DRDPHVCVMG EDVGHYGGSY 120

KVTKGLADKF GDLRVLDTPI CENAFTGMGI GAAMTGLRPV IEGMNMGFLL LAFNQISNNC 180

GMLHYTSGGQ FTIPVVIRGP GGVGRQLGAE HSQRLESYFQ SIPGIQMVAC STPYNAKGLM 240

KAAIRSENPV ILFEHVLLYN LKEKIPDEDY ICNLEEAEMV RPGEHITILT YSRMRYHVMQ 300

AAKTLVNKGY DPEVIDIRSL KPFDLHTIGN SVKKTHRVLI VEECMRTGGI GASLTAAINE 360

NFHDYLDAPV MCLSSQDVPT PYAGTLEEWT VVQPAQIVTA VEQLCQ 406



APPENDIX E 



G GGCGA1CTG GITTGCTAGA TCCAAAACCC TTGTTTGT AG GTTGAGAGAT 50 

A&TCTAAATT TGTCGAGAA.T ICKATAAAA GGTGATIAGT PTCATCGTCC 100 

CAICTIUTAT AGAACTTCTC AGXTAICTKI A^mr-GIAT TTGAGTrrGT 150 

TCGGTAGCCT CCGltATGAG TCTACGGCCG TGGAGAGAGA GGGTCATTAT 200 

TTGGTirAn c AGATTGATGA AGTCGATGCC CAGGA ACTGG XWVTCCZn?. 250 

AGGCAAAGIC GGTTA2ACAT CGGAGATGAA ATICATACrG GAATTATnT 300 

CAAGGAGGAT TCCATGTTAC CGGGTIGTTG AGGAAGAGGG AGGAATTATT 350 

CCCGATAGGG A1TTTA1TCC GGTGAGTGAG AAAGTTGGTG TTAGAATCTPA 400 

CGAAjGAAATG GCGACGCTAC AAGTAATGGA TCAGA TCTTC TAGGA A HTTP 450 

AACGICAAGG AAGAATATCT TTTTATCTIA CITCCGTTGG AGAAGAAG^G 500 

ATTAAGA-.TGG CTTCAGGAGC TGCTCTCAGT GCTGAGGAGG TCGTrrrr^Cr 550 

TCAGTACCGA GAACCTGGAG TIGTITTGTG GCG^GG^r ACGT^GGAGG 600 

AGTITGCTAA TGAGTGTTTT GGGAAGAAAG CTGAW ATGG GAAArGTar-A 650 

CAAATGCGAA TTCATTAGGG TTCGAATCGT CTTAA'T^ ACT TC^CTZTrrr 700 

MlUICCAAIT GCCACGCAAC TTCCTCAAGC TGCTGGAfrrr GGXTA^^p 750 

SGAAAATGGA CAAGAAGAAT GCTTGTACTG T^ACATTGAv GGGAGATGGT 800 

GGCACAAGCG AGGGAGA.TTT TCACGGCGGA TTGAA jITTTG GGG^rHTAA^ 850 

GGAAGCTCCG GTTGTGTTTA TATGTCGGAA CAAGG GTTGG GGGA'TTArrra 900 

(STCATATCTC AGAACAGITT AjGAAGTGATG GAA.1AGITGT GAAAGGTGAA 950 

faTAGGGTA TCCCGAAGCA TCCCGTGTGG GACGC^A CCG ATC-CACTTC-r 1000 

GGTITATAGT GCTGTAGGCT CAGCTCGAGA AATGGGT GIA AGAGAAGAAA 1050 

GACCTGTTGT CATTGAGATG AXGACATA1A GAGTAagACA TCATTCTAGA 1100 

ttlAGATGATT CAAGTAAGTA CAGGGCGGCG GATGA AATCC AGTAGTrcAA 1150 

Mtgtcgaga AACCCTGTGA ATAGATTICG GAAAT GGGTC GAAGATAAGG 1200 

GATGGTGGAG TGAGGAAGAT GAATCCAAGC TAAGATC TAA GGGAAGAAAA 1250 

CAGCTIGTGG AAGCGATICA GGCTGCGGAG AAGTG GGAGA AAGAAGGA^ 1300 

GACAGAGTTG TTTAACGATG TATATGA~GT TAAAG CGAAG AAGGTAnaaG 1350 

AGCAAGAACT TGGTTTGAAG GAATTAGTAA AGAAAGAAG C TCA AGATTAT 1 1400 

CCTGCTGGGT TTCATGTT TG AAIUTAGAGG AACTGTGTGG TTAAAATACC 1450 

TCGCGGACCG CGAATTCGAT ATCAAGCTIC TCATTGCAGA CTA1TTATAT 1500 

TGTCCACGTA TCGAATAGTA ATCAAGTATC AATGTAGAGA CCAGGATTTG 1550 

GAGCA1CAAA AAAAAAAAAA AAAAAAAAAA AAAAAAA 1587 



APPENDIX F 



A I WFARSK7L VSSLRHNLNL STILIKROYS KRPiFYTTSG LSSTAYLSPF 50 

GSLRHESTAV ETQADHLVGQ I DEVDAGELD FPGG.KVGY7S EYKF[PE3SS IOC 

RRIPCYRVLD EDGR i I FCSD F i R VSEKLA / RYYEHYA'lG VYCH : FYEAG 150 

;§GGR:SFYL T S'/GEEA I >\ '. A. SAAALSPjQv 7_=C V REFGV LLWRGF7LES 2CC 

, TPP binding site 

IHANCCFGNKA DYGXGRCMF I HYGSMR_NYF TiSSPIATCL FQAAGVGYSL 250 

I_ BCOADC E16 binding site 

WKOKKNACTV TF I GDGG7SE GDFHAGLNFA A /^EAF V"7F i CRMNGWA[S7 300 

:H.ISE:FRSCG I VVKGCA V G [ P KrR 7 w'CG~G A_A / V 3AVRS AFEYA V 7 ICR 350 
• O 

■i;VL ! EYtf 7YR VGHK57S0CS TKYSAACEIC Y'.vXYSRNF VN RFRKWVEDMC- 400 

WWSEEDE3KL RSNARKGLLG a;GAA.E.<WE< CPLTZ_-niCV YQVKRKNLEE 450 

QELGLKELVK KQPQDYPPGF HV 472 



APPENDIX G 



10 20 30 40 50 

1234567890 1234567890 1234567890 1234567890 1234567890 

TTCITCACCC MlAAMGm GCAAACCTTT GCGACCTAAA AAilCTTACCA 50 

GTTGGGTGAA AGTTGCCAAA ATAGAGCTTG CTTTTGTCGZ AATCCTATAT 100 

ttitcagatt gattgttggt GGGirroiGi aaatggcggc tcttttaggc 150 

AGATCCTGCC GGAAACTGAG TTITCCGAGC TTGACTCACG GAGCTAGGAG 200 

(fe^TCGACG GAAACTGGAA AACCATTGAA TCLATAGTCT GCTATTAATC 250 

||GCGCTTCA CATCGCTTTG GAGACCGATC CrCGGTCTTA TGTCTTTGGG 300 

(MAGACGTIG GCITIGGIGG AGTCTTTCGC TGTAGAACTG GTITAGGTGA 350 

iiZGAITGGGG AAAAACCGIG TCTTCAATAC TCCTCTTTGT GAGCAGGGCA 400 

ifGTTGGATT TGGCATTGGT CIAGCAGCAA TGGGTAATCG AGCAATTGTA 450 

GAGATIGAGT TTGCAGATTA TATATATCCT GCTITTGATC AGATTGTTAA 500 

ISAAGCTGCA AAGITGAGAT ACCGAAGTGG TAACCAATTC AACTGTGGAG 550 

(gCTTACGAT AAGAGCACCA TATGGAGCAG TIGGICAIGG TGGACATTAC 600 

G1HTCACAAT CCCCTGAAGC TTICTITrGC CATGrCCCTG GTATTAAGGT 650 

TteATCOCT CGGAGTCCAC GAGAAGCAAA GGGACTGTTG TTGTCATGTA. 700 

TCCGTGATCC AAATCGCGTT GTTTTCTTCG AACCAAAGTG GCTGTA1CGT 750 

CAAGCAGTAG AAGAAGTCCC TGAGCATGAC TATAIGATAC CTTTATCAGA 800 

AGCAGAGGTT A1AAGAGAAG GCAATGACAT TACACTGGTT GGATGGGGAG 850 

CTCAGCTTAC CGTTATCGAA CAAGCTTGIG TGGACGCGGA AAAGGA AGGA 900 

ATATCATGIG AACTGATAGA TCTCAAGACA CTGCTIGCTT GGGACAAAGA 950 

AACCGTTGAG GCTTCAGTTA AAAAGACIGG CAGACTTCTT ATAAGCCAIG 1000 

AAGCIGCTGT AACAGGAGGT TTIGGAGGAG AGATCTCICC AAGAATTCTG 1050 

GAACGTIGCT TTTIGAAGIT AGAAGCTCCA GTAAGCAGAG TTIGIGGTCT 1100 

GGATACTCCA TTrCCTCTTG TCTTIGAACC ATICTACATG CCCACCAAGA 1150 

ACAAGATATT GGATGCAATC AAATCGACTG TGAATTAGTA GCXZGTACTAT 1200 

CTCTAjGTTTA CTCTTIACAC TAGGAITIAAT GTAATCGCAT GTCTTTGTTA 1250 

TCAATICGTC TAATGIAACA CTACCGAITA ACTTTAATGA ATTICA AGAT 1300 

AACGAAAAAA AAAAAAAAA 1319 



APPENDIX H 



10 20 30 40 50 

1234567890 1234567890 1234567890 1234567890 1234567890 

MAALLGRSCR KLSFPSLTHG ARRVSTETGK PLNLYSAINQ ALHIALDTDP 50

RSYVFGEDVG FGGVFRCTTG LAERFGKNRV FNTPLCEQGI VGFGIGLAAM 100

GNRAIVEIQF ADYIYPAFDQ IVNEAAKFRY RSGNQFNCGG LTIRAPYGAV 150

GHGGHYHSQS PEAFFCHVPG IKVVIPRSPR EAKGLLLSCI RDPNPVVFFE 200

PKWLYRQAVE EVPEHDYMIP LSEAEVIREG NDITLVGWGA QLTVMEQACL 250

DAEKEGISCE LIDLKTLLPW DKETVEASVK KTGRLLISHE APVTGGFGAE 300

ISATILERCF LKLEAPVSRV CGLDTPFPLV FEPFYMPTKN KILDAIKSTV 350

NY 352
SEQUENCE LISTING



<110> Randall, Douglas D. 
Johnston, Mark L. 
Miernyk, Jan A. 
Luethy, Michael H. 
Mooney, Brian P. 



<120> USE OF DNA ENCODING PLASTID PYRUVATE DEHYDROGENASE AND
BRANCHED CHAIN OXOACID DEHYDROGENASE COMPONENTS TO 
ENHANCE POLYHYDROXYALKANOATE BIOSYNTHESIS IN PLANTS 



<130> UMO 1482 



<140> 09/108,020 

<141> 1998-06-30 

<150> 60/051,291 

<151> 1997-06-30 



<150> 60/055,255 
<151> 1997-08-01 



<150> 60/076,544 
<151> 1998-03-02 



<160> 54 



<170> PatentIn Ver. 2.1



<210> 1 

<211> 1541 

<212> DNA 

<213> Arabidopsis thaliana 



<400> 1 

ccacgcgtcc gcatctcttg ttctctccgc ccatctctgc tctcttttat tttcccagaa 60 
agtttttttt tttttttccg aattccgtta atctcattgg ggtttccatt gatagcaatg 120 
gcgacggctt tcgctcccac taagctcact gccacggttc ctctgcatgg atcccatgag 180 
aatcgtctct tgctcccgat ccgattggct cctccttctt ctttcctcgg atccacccgt 240 
tccctctccc ttcgcagact caatcactcc aacgccaccc gtcgatctcc cgtcgtctct 300
gtccaggaag ttgtcaagga gaagcaatcc accaataata ccagcctgtt gataaccaaa 360
gaggaaggat tggagttgta tgaagatatg atactaggta gatctttcga agacatgtgt 420
gctcaaatgt attaccgagg caagatgttt ggttttgttc acttgtacaa tggccaagag 480
gctgtttcta ctggctttat caagctcctt accaagtctg actctgtcgt tagtacctac 540
cgtgaccatg tccatgccct cagcaaaggt gtctctgctc gtgctgttat gagcgagctc 600
ttcggcaagg ttactggatg ctgcagaggc caaggtggat ccatgcacat gttctccaaa 660
gaacacaaca tgcttggtgg ctttgctttt attggtgaag gcattcctgt cgccactggt 720
gctgccttta gctccaagta caggagggaa gtcttgaaac aggattgtga tgatgtcact 780
gtcgcctttt tcggagatgg aacttgtaac aacggacagt tcttcgagtg tctcaacatg 840
gctgctctct ataaactgcc tattatcttt gttgtcgaga ataacttgtg ggccattggg 900
atgtctcact tgagagccac ttctgacccc gagatttgga agaaaggtcc tgcatttggg 960
atgcctggtg ttcatgttga cggtatggat gtcttgaagg tcagggaagt cgctaaagaa 1020
gctgtcacta gagctagaag aggagaaggt ccaaccttgg ttgaatgtga gacttataga 1080
ttcagaggac actccttggc tgatcccgat gagctccgtg atgctgctga gaaagccaaa 1140
tacgcggcta gagacccaat cgcagcattg aagaagtatt tgatagagaa caagcttgca 1200
aaggaagcag agctaaagtc aatagagaaa aagatagacg agttggtgga ggaagcggtt 1260
gagtttgcag acgctagtcc acagcccggt cgcagtcagt tgctagagaa tgtgtttgct 1320
gatccaaaag gatttggaat tggacctgat ggacggtaca gatgtgagga ccccaagttt 1380
accgaaggca cagctcaagt ctgagaagac aagtttaacc ataagctgtc tactgtctct 1440
tcgatgtttc tatatatctt attaagttaa atgctacaga gaatcagttt gaatcatttg 1500
cactttttgc ttaaaaaaaa aaaaaaaaaa aaaaaaaaaa a 1541



<210> 2 
<211> 428 
<212> PRT 

<213> Arabidopsis thaliana 
<400> 2 

Met Ala Thr Ala Phe Ala Pro Thr Lys Leu Thr Ala Thr Val Pro Leu
1 5 10 15

His Gly Ser His Glu Asn Arg Leu Leu Leu Pro Ile Arg Leu Ala Pro
20 25 30

Pro Ser Ser Phe Leu Gly Ser Thr Arg Ser Leu Ser Leu Arg Arg Leu
35 40 45

Asn His Ser Asn Ala Thr Arg Arg Ser Pro Val Val Ser Val Gln Glu
50 55 60

Val Val Lys Glu Lys Gln Ser Thr Asn Asn Thr Ser Leu Leu Ile Thr
65 70 75 80

Lys Glu Glu Gly Leu Glu Leu Tyr Glu Asp Met Ile Leu Gly Arg Ser
85 90 95

Phe Glu Asp Met Cys Ala Gln Met Tyr Tyr Arg Gly Lys Met Phe Gly
100 105 110

Phe Val His Leu Tyr Asn Gly Gln Glu Ala Val Ser Thr Gly Phe Ile
115 120 125

Lys Leu Leu Thr Lys Ser Asp Ser Val Val Ser Thr Tyr Arg Asp His
130 135 140

Val His Ala Leu Ser Lys Gly Val Ser Ala Arg Ala Val Met Ser Glu
145 150 155 160

Leu Phe Gly Lys Val Thr Gly Cys Cys Arg Gly Gln Gly Gly Ser Met
165 170 175

His Met Phe Ser Lys Glu His Asn Met Leu Gly Gly Phe Ala Phe Ile
180 185 190

Gly Glu Gly Ile Pro Val Ala Thr Gly Ala Ala Phe Ser Ser Lys Tyr
195 200 205

Arg Arg Glu Val Leu Lys Gln Asp Cys Asp Asp Val Thr Val Ala Phe
210 215 220

Phe Gly Asp Gly Thr Cys Asn Asn Gly Gln Phe Phe Glu Cys Leu Asn
225 230 235 240

Met Ala Ala Leu Tyr Lys Leu Pro Ile Ile Phe Val Val Glu Asn Asn
245 250 255

Leu Trp Ala Ile Gly Met Ser His Leu Arg Ala Thr Ser Asp Pro Glu
260 265 270

Ile Trp Lys Lys Gly Pro Ala Phe Gly Met Pro Gly Val His Val Asp
275 280 285

Gly Met Asp Val Leu Lys Val Arg Glu Val Ala Lys Glu Ala Val Thr
290 295 300

Arg Ala Arg Arg Gly Glu Gly Pro Thr Leu Val Glu Cys Glu Thr Tyr
305 310 315 320

Arg Phe Arg Gly His Ser Leu Ala Asp Pro Asp Glu Leu Arg Asp Ala
325 330 335

Ala Glu Lys Ala Lys Tyr Ala Ala Arg Asp Pro Ile Ala Ala Leu Lys
340 345 350

Lys Tyr Leu Ile Glu Asn Lys Leu Ala Lys Glu Ala Glu Leu Lys Ser
355 360 365

Ile Glu Lys Lys Ile Asp Glu Leu Val Glu Glu Ala Val Glu Phe Ala
370 375 380

Asp Ala Ser Pro Gln Pro Gly Arg Ser Gln Leu Leu Glu Asn Val Phe
385 390 395 400

Ala Asp Pro Lys Gly Phe Gly Ile Gly Pro Asp Gly Arg Tyr Arg Cys
405 410 415

Glu Asp Pro Lys Phe Thr Glu Gly Thr Ala Gln Val
420 425



<210> 3 
<211> 1441 
<212> DNA 

<213> Arabidopsis thaliana 
<400> 3 

gaaaaaatgt cttcgataat ccatggagct ggagctgcta cgacgacgtt atcgacgttt 60 
aattccgtcg attccaagaa actcttcgtt gctccttctc gcacaaatct ttcagtgagg 120 
agccagagat atatagtggc tggatctgat gcgagtaaga agagctttgg ttctggactt 180 
agagttcgtc actctcagaa attgattcca aatgctgttg cgacgaagga ggcggatacg 240 
tctgcgagca ctggacatga actattgctt ttcgaggctc ttcaggaagg tctggaagaa 300
gagatggaca gagatccaca tgtatgtgtt atgggtgaag atgttggcca ttacggaggt 360
tcctacaagg taaccaaagg ccttgctgat aaatttggtg acctcagggt tctcgacact 420
cctatttgtg aaaatgcatt caccggtatg ggcattggag ctgccatgac tggtctaaga 480
cccgttattg aaggtatgaa catgggtttc ctcctcctcg ccttcaacca aatctccaac 540 
aactgtggaa tgcttcacta cacatccggt ggtcagttta cgatcccggt tgtcatccgt 600 
ggacctggtg gagtgggacg ccagcttggt gctgagcatt cacagaggtt agaatcttac 660 
tttcagtcca tccctgggat ccagatggtt gcttgctcaa ctccttacaa cgccaaaggg 720 
ttgatgaaag ccgcaataag aagcgagaac cctgtgattc tgttcgaaca cgtgctgctt 780 
tacaatctca aggagaaaat cccggatgaa gattacatct gtaaccttga agaagctgag 840 
atggtcagac ctggcgagca cattaccatc ctcacttact cgcgaatgag gtaccatgtg 900 
atgcaggcag caaaaactct ggtgaacaaa gggtatgacc ccgaggttat cgacatcagg 960 
tcactgaaac cgttcgacct tcacacaatt ggaaactcgg tgaagaaaac acatcgggtt 1020
ttgatcgtgg aggagtgtat gagaaccggt gggattgggg caagtcttac agctgccatc 1080
aacgagaact ttcatgacta cttagatgct ccggtgatgt gtttatcttc tcaagacgtt 1140
cctacacctt acgctggtac actggaggag tggaccgtgg ttcaaccggc tcagatcgtg 1200 
accgctgtcg agcagctttg ccagtaaatt catatttatc cgatgaacca ttatttatca 1260 
tttacctctc catttccttt ctctgtagct tagttcttaa agaatttgtc taagatggtt 1320 
tgtttttgtt aaagtttgtc tcctttgttg tgtcttttaa tatggtttgt aactcagaat 1380 
gtttgtttgt taattttatc tcccactttc ttttaaaaaa aaaaaaaaaa aaaaaaaaaa 1440
a 1441 



<210> 4 
<211> 406 
<212> PRT 

<213> Arabidopsis thaliana 
<400> 4 

Met Ser Ser Ile Ile His Gly Ala Gly Ala Ala Thr Thr Thr Leu Ser
1 5 10 15

Thr Phe Asn Ser Val Asp Ser Lys Lys Leu Phe Val Ala Pro Ser Arg
20 25 30

Thr Asn Leu Ser Val Arg Ser Gln Arg Tyr Ile Val Ala Gly Ser Asp
35 40 45

Ala Ser Lys Lys Ser Phe Gly Ser Gly Leu Arg Val Arg His Ser Gln
50 55 60

Lys Leu Ile Pro Asn Ala Val Ala Thr Lys Glu Ala Asp Thr Ser Ala
65 70 75 80

Ser Thr Gly His Glu Leu Leu Leu Phe Glu Ala Leu Gln Glu Gly Leu
85 90 95

Glu Glu Glu Met Asp Arg Asp Pro His Val Cys Val Met Gly Glu Asp
100 105 110

Val Gly His Tyr Gly Gly Ser Tyr Lys Val Thr Lys Gly Leu Ala Asp
115 120 125

Lys Phe Gly Asp Leu Arg Val Leu Asp Thr Pro Ile Cys Glu Asn Ala
130 135 140

Phe Thr Gly Met Gly Ile Gly Ala Ala Met Thr Gly Leu Arg Pro Val
145 150 155 160

Ile Glu Gly Met Asn Met Gly Phe Leu Leu Leu Ala Phe Asn Gln Ile
165 170 175

Ser Asn Asn Cys Gly Met Leu His Tyr Thr Ser Gly Gly Gln Phe Thr
180 185 190

Ile Pro Val Val Ile Arg Gly Pro Gly Gly Val Gly Arg Gln Leu Gly
195 200 205

Ala Glu His Ser Gln Arg Leu Glu Ser Tyr Phe Gln Ser Ile Pro Gly
210 215 220

Ile Gln Met Val Ala Cys Ser Thr Pro Tyr Asn Ala Lys Gly Leu Met
225 230 235 240

Lys Ala Ala Ile Arg Ser Glu Asn Pro Val Ile Leu Phe Glu His Val
245 250 255

Leu Leu Tyr Asn Leu Lys Glu Lys Ile Pro Asp Glu Asp Tyr Ile Cys
260 265 270

Asn Leu Glu Glu Ala Glu Met Val Arg Pro Gly Glu His Ile Thr Ile
275 280 285

Leu Thr Tyr Ser Arg Met Arg Tyr His Val Met Gln Ala Ala Lys Thr
290 295 300

Leu Val Asn Lys Gly Tyr Asp Pro Glu Val Ile Asp Ile Arg Ser Leu
305 310 315 320

Lys Pro Phe Asp Leu His Thr Ile Gly Asn Ser Val Lys Lys Thr His
325 330 335

Arg Val Leu Ile Val Glu Glu Cys Met Arg Thr Gly Gly Ile Gly Ala
340 345 350

Ser Leu Thr Ala Ala Ile Asn Glu Asn Phe His Asp Tyr Leu Asp Ala
355 360 365

Pro Val Met Cys Leu Ser Ser Gln Asp Val Pro Thr Pro Tyr Ala Gly
370 375 380

Thr Leu Glu Glu Trp Thr Val Val Gln Pro Ala Gln Ile Val Thr Ala
385 390 395 400

Val Glu Gln Leu Cys Gln
405



<210> 5 
<211> 1708 
<212> DNA 

<213> Arabidopsis thaliana 
<400> 5 

cgtccacttc actctctcta aactctctct cagatctctc tctctctgtg attcaacaat 60 
ggcggtttct tcttcttcgt ttctatcgac agcttcacta accaattcca aatccaacat 120 
ttcattcgct tcctcagtat ccccatccct ccgcagcgtc gttttccgct ccacgactcc 180 
ggcgacttct caccgtcgtt caatgacggt ccgatctaag attcgtgaaa ttttcatgcc 240 
ggcgttatca tcaaccatga cggaaggcaa aatcgtgtca tggatcaaaa cagaaggcga 300 
gaaactcgcc aagggagaga gtgttgtggt tgttgaatct gataaagccg atatggatgt 360 
agaaacgttt tacgatggtt atcttgctgc gattgtcgtc ggagaaggtg aaacagctcc 420
ggttggtgct gcgattggat tgttagctga gactgaagct gagatcgaag aagctaagag 480 
taaagccgct tcgaaatctt cttcttctgt ggctgaggct gtcgttccat ctcctcctcc 540 
ggttacttct tctcctgctc cggcgattgc tcaaccggct ccggtgacgg cagtatcaga 600 
tggtccgagg aagactgttg cgacgccgta tgctaagaag cttgctaaac aacacaaggt 660 
tgatattgaa tccgttgctg gaactggacc attcggtagg attacggctt ctgatgtgga 720
gacggcggct ggaattgctc cgtccaaatc ctccatcgca ccaccgcctc ctcctccacc 780
tccggtgacg gctaaagcaa ccaccactaa tttgcctcct ctgttacctg attcaagcat 840
tgttcctttc acagcaatgc aatctgcagt atctaagaac atgattgaga gtctctctgt 900
tcctacattc cgtgttggtt atcctgtgaa cactgacgct cttgatgcac tttacgagaa 960
ggtgaagcca aagggtgtaa caatgacagc tttattagct aaagctgcag ggatggcctt 1020
ggctcagcat cctgtggtga acgctagctg caaagacggg aagagtttta gttacaatag 1080
tagcattaac attgcagtgg cggttgctat caatggtggc ctgattacgc ctgttctaca 1140
agatgcagat aagttggatt tgtacttgtt atctcaaaaa tggaaagagc tggtggggaa 1200
agctagaagc aagcaacttc aaccccatga atacaactct ggaactttta ctttatcgaa 1260
tctcggtatg tttggagtgg atagatttga cgctattctt ccgccaggac agggtgctat 1320
tatggctgtt ggagcgtcaa agccaactgt agttgctgat aaggatggat tcttcagtgt 1380
aaaaaacaca atgctggtga atgtgactgc agatcatcgc attgtgtatg gagctgactt 1440
ggctgctttt ctccaaacct ttgcaaagat cattgagaat ccagatagtt tgaccttata 1500
agacgccaag cgaagacgag aagtcaaaaa cagtttccaa aattcctgag ccaaattttt 1560
cccaagtaaa ttttttaacc tcaatgttct tgggcttgcc caacttcttt tgcatctttt 1620
cctcacttgg gttgtaccgg tatttggttt caagaatcac cattttgggg ttttaacaaa 1680
taatttccaa ccaaaaaaaa aaaaaaaa 1708



<210> 6 

<211> 480 

<212> PRT 

<213> Arabidopsis thaliana 



<400> 6 

Met Ala Val Ser Ser Ser Ser Phe Leu Ser Thr Ala Ser Leu Thr Asn
1 5 10 15

Ser Lys Ser Asn Ile Ser Phe Ala Ser Ser Val Ser Pro Ser Leu Arg
20 25 30

Ser Val Val Phe Arg Ser Thr Thr Pro Ala Thr Ser His Arg Arg Ser
35 40 45

Met Thr Val Arg Ser Lys Ile Arg Glu Ile Phe Met Pro Ala Leu Ser
50 55 60

Ser Thr Met Thr Glu Gly Lys Ile Val Ser Trp Ile Lys Thr Glu Gly
65 70 75 80

Glu Lys Leu Ala Lys Gly Glu Ser Val Val Val Val Glu Ser Asp Lys
85 90 95

Ala Asp Met Asp Val Glu Thr Phe Tyr Asp Gly Tyr Leu Ala Ala Ile
100 105 110

Val Val Gly Glu Gly Glu Thr Ala Pro Val Gly Ala Ala Ile Gly Leu
115 120 125

Leu Ala Glu Thr Glu Ala Glu Ile Glu Glu Ala Lys Ser Lys Ala Ala
130 135 140

Ser Lys Ser Ser Ser Ser Val Ala Glu Ala Val Val Pro Ser Pro Pro
145 150 155 160

Pro Val Thr Ser Ser Pro Ala Pro Ala Ile Ala Gln Pro Ala Pro Val
165 170 175

Thr Ala Val Ser Asp Gly Pro Arg Lys Thr Val Ala Thr Pro Tyr Ala
180 185 190

Lys Lys Leu Ala Lys Gln His Lys Val Asp Ile Glu Ser Val Ala Gly
195 200 205

Thr Gly Pro Phe Gly Arg Ile Thr Ala Ser Asp Val Glu Thr Ala Ala
210 215 220

Gly Ile Ala Pro Ser Lys Ser Ser Ile Ala Pro Pro Pro Pro Pro Pro
225 230 235 240

Pro Pro Val Thr Ala Lys Ala Thr Thr Thr Asn Leu Pro Pro Leu Leu
245 250 255

Pro Asp Ser Ser Ile Val Pro Phe Thr Ala Met Gln Ser Ala Val Ser
260 265 270

Lys Asn Met Ile Glu Ser Leu Ser Val Pro Thr Phe Arg Val Gly Tyr
275 280 285

Pro Val Asn Thr Asp Ala Leu Asp Ala Leu Tyr Glu Lys Val Lys Pro
290 295 300

Lys Gly Val Thr Met Thr Ala Leu Leu Ala Lys Ala Ala Gly Met Ala
305 310 315 320

Leu Ala Gln His Pro Val Val Asn Ala Ser Cys Lys Asp Gly Lys Ser
325 330 335

Phe Ser Tyr Asn Ser Ser Ile Asn Ile Ala Val Ala Val Ala Ile Asn
340 345 350

Gly Gly Leu Ile Thr Pro Val Leu Gln Asp Ala Asp Lys Leu Asp Leu
355 360 365

Tyr Leu Leu Ser Gln Lys Trp Lys Glu Leu Val Gly Lys Ala Arg Ser
370 375 380

Lys Gln Leu Gln Pro His Glu Tyr Asn Ser Gly Thr Phe Thr Leu Ser
385 390 395 400

Asn Leu Gly Met Phe Gly Val Asp Arg Phe Asp Ala Ile Leu Pro Pro
405 410 415

Gly Gln Gly Ala Ile Met Ala Val Gly Ala Ser Lys Pro Thr Val Val
420 425 430

Ala Asp Lys Asp Gly Phe Phe Ser Val Lys Asn Thr Met Leu Val Asn
435 440 445

Val Thr Ala Asp His Arg Ile Val Tyr Gly Ala Asp Leu Ala Ala Phe
450 455 460

Leu Gln Thr Phe Ala Lys Ile Ile Glu Asn Pro Asp Ser Leu Thr Leu
465 470 475 480



<210> 7 
<211> 25 
<212> DNA 

<213> Arabidopsis thaliana 
<400> 7 

cggtaccaag tctgactctg tcgtt 25 



<210> 8 
<211> 26 
<212> DNA 

<213> Arabidopsis thaliana 
<400> 8 

ccttcgaagg ttccatctcc gaaaaa 26



<210> 9 
<211> 25 
<212> DNA 

<213> Arabidopsis thaliana 
<400> 9 

cggtaccttc gaggctcttc aggaa 25 



<210> 10
<211> 24
<212> DNA

<213> Arabidopsis thaliana 
<400> 10 

ccttcgaacg ggccttagac cagt 24 



<210> 11 
<211> 1587 
<212> DNA 

<213> Arabidopsis thaliana 
<400> 11 

gggcgatctg gtttgctaga tccaaaaccc ttgtttctag cttgagacat aatctaaatt 60
tgtcgacaat tctcataaaa cgtgattact ctcatcgtcc catcttctat acaacttctc 120
agttatcttc aacggcgtat ttgagtccct tcggtagcct ccgtcatgag tctacggccg 180
tggagacaca ggctgatcat ttggttcagc agattgatga agtcgatgcc caggaactgg 240
atttcccagg aggcaaagtc ggttacacat cggagatgaa attcataccg gaatcatctt 300
caaggaggat tccatgttac cgggttcttg acgaagacgg acgaatcatc cccgatagcg 360
attttattcc ggtgagtgag aaactcgctg ttagaatgta cgaacaaatg gcgacgctac 420
aagtaatgga tcacatcttc tacgaagctc aacgtcaagg aagaatatct ttttatctta 480
cttccgtcgg agaagaagcc attaacatcg cttcagcagc tgctctcagt cctgacgacg 540
tcgttttacc tcagtaccga gaacctggag ttcttttgtg gcgtggcttc acgttggagg 600
agtttgctaa tcagtgtttt gggaacaaag ctgattatgg caaaggcaga caaatgccaa 660
ttcattacgg ttccaatcgt cttaattact tcactatctc ctctccaatt gccacgcaac 720
ttcctcaagc tgctggagtt ggttattctt tgaaaatgga caagaagaat gcttgtactg 780
ttacattcat cggagatggt ggcacaagcg agggagattt tcacgccgga ttgaattttg 840
cggccgtaat ggaagctccg gttgtgttta tatgtcggaa caacggttgg gcgattagta 900
ctcatatctc agaacagttt agaagtgatg gaatagttgt gaaaggtcaa gcttacggta 960
tcccgaagca tcccgtgtgg gacggtaccg atgcacttgc ggtttatagt gctgtacgct 1020
cagctcgaga aatggctgta acagaacaaa gacctgttct cattgagatg atgacatata 1080
gagtaggaca tcattctaca tcagatgatt caactaagta cagggcggcg gatgaaatcc 1140
agtactggaa aatgtcgaga aaccctgtga atagatttcg gaaatgggtc gaagataacg 1200
gatggtggag tgaggaagat gaatccaagc taagatctaa cgcaagaaaa cagcttctgc 1260
aagcgattca ggctgcggag aagtgggaga aacaaccatt gacagagttg tttaacgatg 1320
tatatgatgt taaaccgaag aacctagaag agcaagaact tggtttgaag gaattagtaa 1380
agaaacaacc tcaagattat cctcctggct ttcatgtttg aatctagagg aactgtgtgg 1440
ttaaaatacc tcgcggaccg cgaattcgat atcaagcttc tcattgcaga ctatttatat 1500
tgtccacgta tcgaatagta atcaagtatc aatgtagaga ccagcatttg gagcatcaaa 1560
aaaaaaaaaa aaaaaaaaaa aaaaaaa 1587



<210> 12
<211> 472
<212> PRT

<213> Arabidopsis thaliana 
<400> 12 

Ala Ile Trp Phe Ala Arg Ser Lys Thr Leu Val Ser Ser Leu Arg His
1 5 10 15

Asn Leu Asn Leu Ser Thr Ile Leu Ile Lys Arg Asp Tyr Ser His Arg
20 25 30

Pro Ile Phe Tyr Thr Thr Ser Gln Leu Ser Ser Thr Ala Tyr Leu Ser
35 40 45

Pro Phe Gly Ser Leu Arg His Glu Ser Thr Ala Val Glu Thr Gln Ala
50 55 60

Asp His Leu Val Gln Gln Ile Asp Glu Val Asp Ala Gln Glu Leu Asp
65 70 75 80

Phe Pro Gly Gly Lys Val Gly Tyr Thr Ser Glu Met Lys Phe Ile Pro
85 90 95

Glu Ser Ser Ser Arg Arg Ile Pro Cys Tyr Arg Val Leu Asp Glu Asp
100 105 110

Gly Arg Ile Ile Pro Asp Ser Asp Phe Ile Pro Val Ser Glu Lys Leu
115 120 125

Ala Val Arg Met Tyr Glu Gln Met Ala Thr Leu Gln Val Met Asp His
130 135 140

Ile Phe Tyr Glu Ala Gln Arg Gln Gly Arg Ile Ser Phe Tyr Leu Thr
145 150 155 160

Ser Val Gly Glu Glu Ala Ile Asn Ile Ala Ser Ala Ala Ala Leu Ser
165 170 175

Pro Asp Asp Val Val Leu Pro Gln Tyr Arg Glu Pro Gly Val Leu Leu
180 185 190

Trp Arg Gly Phe Thr Leu Glu Glu Phe Ala Asn Gln Cys Phe Gly Asn
195 200 205

Lys Ala Asp Tyr Gly Lys Gly Arg Gln Met Pro Ile His Tyr Gly Ser
210 215 220

Asn Arg Leu Asn Tyr Phe Thr Ile Ser Ser Pro Ile Ala Thr Gln Leu
225 230 235 240

Pro Gln Ala Ala Gly Val Gly Tyr Ser Leu Lys Met Asp Lys Lys Asn
245 250 255

Ala Cys Thr Val Thr Phe Ile Gly Asp Gly Gly Thr Ser Glu Gly Asp
260 265 270

Phe His Ala Gly Leu Asn Phe Ala Ala Val Met Glu Ala Pro Val Val
275 280 285

Phe Ile Cys Arg Asn Asn Gly Trp Ala Ile Ser Thr His Ile Ser Glu
290 295 300

Gln Phe Arg Ser Asp Gly Ile Val Val Lys Gly Gln Ala Tyr Gly Ile
305 310 315 320

Pro Lys His Pro Val Trp Asp Gly Thr Asp Ala Leu Ala Val Tyr Ser
325 330 335

Ala Val Arg Ser Ala Arg Glu Met Ala Val Thr Glu Gln Arg Pro Val
340 345 350

Leu Ile Glu Met Met Thr Tyr Arg Val Gly His His Ser Thr Ser Asp
355 360 365

Asp Ser Thr Lys Tyr Arg Ala Ala Asp Glu Ile Gln Tyr Trp Lys Met
370 375 380

Ser Arg Asn Pro Val Asn Arg Phe Arg Lys Trp Val Glu Asp Asn Gly
385 390 395 400

Trp Trp Ser Glu Glu Asp Glu Ser Lys Leu Arg Ser Asn Ala Arg Lys
405 410 415

Gln Leu Leu Gln Ala Ile Gln Ala Ala Glu Lys Trp Glu Lys Gln Pro
420 425 430

Leu Thr Glu Leu Phe Asn Asp Val Tyr Asp Val Lys Pro Lys Asn Leu
435 440 445

Glu Glu Gln Glu Leu Gly Leu Lys Glu Leu Val Lys Lys Gln Pro Gln
450 455 460

Asp Tyr Pro Pro Gly Phe His Val
465 470



<210> 13
<211> 1319
<212> DNA

<213> Arabidopsis thaliana 



<400> 13 

ttcttcaccc accaaaagta gcaaaccttt gccacctaaa aatcttacca gttgggtgaa 60 
agttgccaaa atagagcttg cttttgtcgc aatcctatat ttttcagatt gattgttggt 120
gggtttgtgt aaatggcggc tcttttaggc agatcctgcc ggaaactgag ttttccgagc 180
ttgactcacg gagctaggag ggtatcgacg gaaactggaa aaccattgaa tctatactct 240 
gctattaatc aagcgcttca catcgctttg gacaccgatc ctcggtctta tgtctttggg 300 
gaagacgttg gctttggtgg agtctttcgc tgtacaactg gtttagctga acgattcggg 360 
aaaaaccgtg tcttcaatac tcctctttgt gagcagggca ttgttggatt tggcattggt 420
ctagcagcaa tgggtaatcg agcaattgta gagattcagt ttgcagatta tatatatcct 480 
gcttttgatc agattgttaa tgaagctgca aagttcagat accgaagtgg taaccaattc 540
aactgtggag gacttacgat aagagcacca tatggagcag ttggtcatgg tggacattac 600
cattcacaat cccctgaagc tttcttttgc catgtccctg gtattaaggt tgttatccct 660 
cggagtccac gagaagcaaa gggactgttg ttgtcatgta tccgtgatcc aaatcccgtt 720
gttttcttcg aaccaaagtg gctgtatcgt caagcagtag aagaagtccc tgagcatgac 780
tatatgatac ctttatcaga agcagaggtt ataagagaag gcaatgacat tacactggtt 840
ggatggggag ctcagcttac cgttatggaa caagcttgtc tggacgcgga aaaggaagga 900
atatcatgtg aactgataga tctcaagaca ctgcttcctt gggacaaaga aaccgttgag 960 
gcttcagtta aaaagactgg cagacttctt ataagccatg aagctcctgt aacaggaggt 1020 
tttggagcag agatctctgc aacaattctg gaacgttgct ttttgaagtt agaagctcca 1080 
gtaagcagag tttgtggtct ggatactcca tttcctcttg tgtttgaacc attctacatg 1140 
cccaccaaga acaagatatt ggatgcaatc aaatcgactg tgaattacta gccgtactat 1200
ctgtagttta ctgtttacac taggactaat gtaatcgcat gtctttgtta tcaattcgtc 1260 
taatgtaaca ctaccgatta actttaatga atttcaagat aacgaaaaaa aaaaaaaaa 1319 



<210> 14 
<211> 352 
<212> PRT 

<213> Arabidopsis thaliana 
<400> 14 

Met Ala Ala Leu Leu Gly Arg Ser Cys Arg Lys Leu Ser Phe Pro Ser
1 5 10 15

Leu Thr His Gly Ala Arg Arg Val Ser Thr Glu Thr Gly Lys Pro Leu
20 25 30

Asn Leu Tyr Ser Ala Ile Asn Gln Ala Leu His Ile Ala Leu Asp Thr
35 40 45

Asp Pro Arg Ser Tyr Val Phe Gly Glu Asp Val Gly Phe Gly Gly Val
50 55 60

Phe Arg Cys Thr Thr Gly Leu Ala Glu Arg Phe Gly Lys Asn Arg Val
65 70 75 80

Phe Asn Thr Pro Leu Cys Glu Gln Gly Ile Val Gly Phe Gly Ile Gly
85 90 95

Leu Ala Ala Met Gly Asn Arg Ala Ile Val Glu Ile Gln Phe Ala Asp
100 105 110

Tyr Ile Tyr Pro Ala Phe Asp Gln Ile Val Asn Glu Ala Ala Lys Phe
115 120 125

Arg Tyr Arg Ser Gly Asn Gln Phe Asn Cys Gly Gly Leu Thr Ile Arg
130 135 140

Ala Pro Tyr Gly Ala Val Gly His Gly Gly His Tyr His Ser Gln Ser
145 150 155 160

Pro Glu Ala Phe Phe Cys His Val Pro Gly Ile Lys Val Val Ile Pro
165 170 175

Arg Ser Pro Arg Glu Ala Lys Gly Leu Leu Leu Ser Cys Ile Arg Asp
180 185 190

Pro Asn Pro Val Val Phe Phe Glu Pro Lys Trp Leu Tyr Arg Gln Ala
195 200 205

Val Glu Glu Val Pro Glu His Asp Tyr Met Ile Pro Leu Ser Glu Ala
210 215 220

Glu Val Ile Arg Glu Gly Asn Asp Ile Thr Leu Val Gly Trp Gly Ala
225 230 235 240

Gln Leu Thr Val Met Glu Gln Ala Cys Leu Asp Ala Glu Lys Glu Gly
245 250 255

Ile Ser Cys Glu Leu Ile Asp Leu Lys Thr Leu Leu Pro Trp Asp Lys
260 265 270

Glu Thr Val Glu Ala Ser Val Lys Lys Thr Gly Arg Leu Leu Ile Ser
275 280 285

His Glu Ala Pro Val Thr Gly Gly Phe Gly Ala Glu Ile Ser Ala Thr
290 295 300

Ile Leu Glu Arg Cys Phe Leu Lys Leu Glu Ala Pro Val Ser Arg Val
305 310 315 320

Cys Gly Leu Asp Thr Pro Phe Pro Leu Val Phe Glu Pro Phe Tyr Met
325 330 335

Pro Thr Lys Asn Lys Ile Leu Asp Ala Ile Lys Ser Thr Val Asn Tyr
340 345 350



<210> 15 
<211> 1450 
<212> DNA 

<213> Arabidopsis thaliana 



<400> 15 

agaaacaaac acacggacca accgttcata acaatgatcg cgcgacggat ctggcgaagc 60
caccggtttc tccgcccatt cagctcgtca tctgtttgct ctccgccgtt ccgggtaccg 120
gagtatcttt ctcagtcgtc ttcctctccg gcgtcgcgcc cattctttgt tcaccctccc 180
actttgatga aatggggtgg aggaagtaga agctggtttt cgaacgaagc catggccact 240
gattcaaatt cagggttaat tgatgtgcca ctagctcaaa ctggggaagg tattgctgaa 300
tgtgagcttc tcaagtggtt tgtcaaagag ggagattctg tggaagagtt tcagccactc 360
tgtgaagttc agagcgataa agcaactata gagatcacaa gtcgttttaa agggaaagtg 420
gctctgattt cacattctcc aggtgacatt attaaggttg gagagactct ggttaggttg 480
gcggttgaag actcgcagga ttcgcttcta accactgata gttcagaaat tgtaactctg 540
ggaggttcaa agcagggaac agaaaatctt cttggagctc tctcaacgcc tgcggttcgt 600
aaccttgcaa aagaccttgg catagatatc aatgttataa ctggaactgg taaagatggt 660
agagttttga aagaggatgt tctccggttt agtgaccaga aaggatttgt aacagattca 720
gtttcttctg agcatgctgt tataggagga gactcggttt ccactaaagc tagtagtaac 780
tttgaagata aaacagttcc tctaagggga ttcagccgag caatggtcaa gacaatgact 840
atggctacaa gtgtaccgca ttttcatttt gttgaagaga taaactgcga ctcacttgtg 900
gagctcaagc agttcttcaa agagaacaat acagattcca ccatcaaaca cacttttctt 960
cctactttaa tcaagtctct gtcaatggct ctaaccaaat atcccttcgt gaatagttgc 1020
ttcaacgcgg aatctctcga gatcattctc aaaggttcac ataatattgg agttgcaatg 1080
gccactgaac atggccttgt cgttcctaat ataaagaatg ttcagtcatt atctctgcta 1140
gagataacca aagagctgtc ccggttacaa catttggcag caaacaacaa acttaacccc 1200
gaggatgtga ctggtggaac cataactctg agtaacattg gagcaattgg tggtaaattc 1260
ggatcccttc ttttaaactt accggaagtt gcaatcatcg ttcttggaag aatcgagaaa 1320
gttccaaaat tctcaaaaga aggaactgtc tatcctgcat cgataatgat ggttaacatt 1380
gctgcggatc atagagttct agatggggca acggtagctc ggttttgctg ccagtggaaa 1440
gagtatgtcg 1450



<210> 16 
<211> 483 
<212> PRT 

<213> Arabidopsis thaliana 
<400> 16

Met Ile Ala Arg Arg Ile Trp Arg Ser His Arg Phe Leu Arg Pro Phe
1 5 10 15

Ser Ser Ser Ser Val Cys Ser Pro Pro Phe Arg Val Pro Glu Tyr Leu
20 25 30

Ser Gln Ser Ser Ser Ser Pro Ala Ser Arg Pro Phe Phe Val His Pro
35 40 45

Pro Thr Leu Met Lys Trp Gly Gly Gly Ser Arg Ser Trp Phe Ser Asn
50 55 60

Glu Ala Met Ala Thr Asp Ser Asn Ser Gly Leu Ile Asp Val Pro Leu
65 70 75 80

Ala Gln Thr Gly Glu Gly Ile Ala Glu Cys Glu Leu Leu Lys Trp Phe
85 90 95

Val Lys Glu Gly Asp Ser Val Glu Glu Phe Gln Pro Leu Cys Glu Val
100 105 110

Gln Ser Asp Lys Ala Thr Ile Glu Ile Thr Ser Arg Phe Lys Gly Lys
115 120 125

Val Ala Leu Ile Ser His Ser Pro Gly Asp Ile Ile Lys Val Gly Glu
130 135 140

Thr Leu Val Arg Leu Ala Val Glu Asp Ser Gln Asp Ser Leu Leu Thr
145 150 155 160

Thr Asp Ser Ser Glu Ile Val Thr Leu Gly Gly Ser Lys Gln Gly Thr
165 170 175

Glu Asn Leu Leu Gly Ala Leu Ser Thr Pro Ala Val Arg Asn Leu Ala
180 185 190

Lys Asp Leu Gly Ile Asp Ile Asn Val Ile Thr Gly Thr Gly Lys Asp
195 200 205

Gly Arg Val Leu Lys Glu Asp Val Leu Arg Phe Ser Asp Gln Lys Gly
210 215 220

Phe Val Thr Asp Ser Val Ser Ser Glu His Ala Val Ile Gly Gly Asp
225 230 235 240

Ser Val Ser Thr Lys Ala Ser Ser Asn Phe Glu Asp Lys Thr Val Pro
245 250 255

Leu Arg Gly Phe Ser Arg Ala Met Val Lys Thr Met Thr Met Ala Thr
260 265 270

Ser Val Pro His Phe His Phe Val Glu Glu Ile Asn Cys Asp Ser Leu
275 280 285

Val Glu Leu Lys Gln Phe Phe Lys Glu Asn Asn Thr Asp Ser Thr Ile
290 295 300

Lys His Thr Phe Leu Pro Thr Leu Ile Lys Ser Leu Ser Met Ala Leu
305 310 315 320

Thr Lys Tyr Pro Phe Val Asn Ser Cys Phe Asn Ala Glu Ser Leu Glu
325 330 335

Ile Ile Leu Lys Gly Ser His Asn Ile Gly Val Ala Met Ala Thr Glu
340 345 350

His Gly Leu Val Val Pro Asn Ile Lys Asn Val Gln Ser Leu Ser Leu
355 360 365

Leu Glu Ile Thr Lys Glu Leu Ser Arg Leu Gln His Leu Ala Ala Asn
370 375 380

Asn Lys Leu Asn Pro Glu Asp Val Thr Gly Gly Thr Ile Thr Leu Ser
385 390 395 400

Asn Ile Gly Ala Ile Gly Gly Lys Phe Gly Ser Leu Leu Leu Asn Leu
405 410 415

Pro Glu Val Ala Ile Ile Val Leu Gly Arg Ile Glu Lys Val Pro Lys
420 425 430

Phe Ser Lys Glu Gly Thr Val Tyr Pro Ala Ser Ile Met Met Val Asn
435 440 445

Ile Ala Ala Asp His Arg Val Leu Asp Gly Ala Thr Val Ala Arg Phe
450 455 460

Cys Cys Gln Trp Lys Glu Tyr Val Glu Lys Pro Glu Leu Leu Met Leu
465 470 475 480

Gln Met Arg



<210> 17
<211> 24
<212> DNA

<213> Arabidopsis thaliana
<400> 17

gggccccata tggcgacggc tttc



<210> 18 
<211> 26 
<212> DNA 

<213> Arabidopsis thaliana 
<400> 18 

ggggcggccg ctaataacca cctaac 



<210> 19 
<211> 33 
<212> DNA 

<213> Arabidopsis thaliana 
<400> 19 

gggcccgcgg ccgctgatca tttggttcag cag 



<210> 20 
<211> 33 
<212> DNA 

<213> Arabidopsis thaliana 
<400> 20 

gggcccgcgg ccgctgatca tttggttcag cag 



<210> 21
<211> 30
<212> DNA

<213> Arabidopsis thaliana
<400> 21

gggcccgtcg actcaaacat gaaagccagg



<210> 22
<211> 24
<212> DNA

<213> Arabidopsis thaliana
<400> 22

gggccccata tgtcttcgat aatc



<210> 23 
<211> 27 
<212> DNA 

<213> Arabidopsis thaliana 
<400> 23 

gggcccctcg agaccttcct gaagagc 



<210> 24 
<211> 27 
<212> DNA 

<213> Arabidopsis thaliana 
<400> 24 

gggcccctcg agaccttcct gaagagc 



<210> 25 
<211> 33 
<212> DNA 

<213> Arabidopsis thaliana 
<400> 25 

gggcccgaat tctcattact agtaattcac agt 



<210> 26 

<211> 24 

<212> DNA 

<213> Arabidopsis thaliana 



<400> 26 

gggccccata tggcggtttc ttct 



<210> 27 
<211> 28 
<212> DNA 

<213> Arabidopsis thaliana 
<400> 27 

gggcccccat ggcaatttca ggattctt



<210> 28
<211> 24 
<212> DNA 

<213> Arabidopsis thaliana 
<400> 28 

gggccccata tgtcttcgat aatc 



<210> 29 
<211> 26 
<212> DNA 

<213> Arabidopsis thaliana 
<400> 29 

gggcccccat ggcgacggct ttcgct 



<210> 30 
<211> 31 
<212> DNA 

<213> Arabidopsis thaliana 
<400> 30 

gggccctgat catattattg gtggattgct t 



<210> 31 
<211> 27 
<212> DNA 

<213> Arabidopsis thaliana 
<400> 31 

gggcccctcg agatcgcttt ggacacc 



<210> 32 
<211> 32 
<212> DNA 

<213> Arabidopsis thaliana 
<400> 32 

gggcccgcgg ccgcattatt ggtggattgc tt 



<210> 33 
<211> 428
<212> PRT

<213> Arabidopsis thaliana
<400> 33

Met Ala Thr Ala Phe Ala Pro Thr Lys Leu Thr Ala Thr Val Pro Leu
1 5 10 15

His Gly Ser His Glu Asn Arg Leu Leu Leu Pro Ile Arg Leu Ala Pro
20 25 30

Pro Ser Ser Phe Leu Gly Ser Thr Arg Ser Leu Ser Leu Arg Arg Leu
35 40 45

Asn His Ser Asn Ala Thr Arg Arg Ser Pro Val Val Ser Val Gln Glu
50 55 60

Val Val Lys Glu Lys Gln Ser Thr Asn Asn Thr Ser Leu Leu Ile Thr
65 70 75 80

Lys Glu Glu Gly Leu Glu Leu Tyr Glu Asp Met Ile Leu Gly Arg Ser
85 90 95

Phe Glu Asp Met Cys Ala Gln Met Tyr Tyr Arg Gly Lys Met Phe Gly
100 105 110

Phe Val His Leu Tyr Asn Gly Gln Glu Ala Val Ser Thr Gly Phe Ile
115 120 125

Lys Leu Leu Thr Lys Ser Asp Ser Val Val Ser Thr Tyr Arg Asp His
130 135 140

Val His Ala Leu Ser Lys Gly Val Ser Ala Arg Ala Val Met Ser Glu
145 150 155 160

Leu Phe Gly Lys Val Thr Gly Cys Cys Arg Gly Gln Gly Gly Ser Met
165 170 175

His Met Phe Ser Lys Glu His Asn Met Leu Gly Gly Phe Ala Phe Ile
180 185 190

Gly Glu Gly Ile Pro Val Ala Thr Gly Ala Ala Phe Ser Ser Lys Tyr
195 200 205

Arg Arg Glu Val Leu Lys Gln Asp Cys Asp Asp Val Thr Val Ala Phe
210 215 220

Phe Gly Asp Gly Thr Cys Asn Asn Gly Gln Phe Phe Glu Cys Leu Asn
225 230 235 240

Met Ala Ala Leu Tyr Lys Leu Pro Ile Ile Phe Val Val Glu Asn Asn
245 250 255

Leu Trp Ala Ile Gly Met Ser His Leu Arg Ala Thr Ser Asp Pro Glu
260 265 270

Ile Trp Lys Lys Gly Pro Ala Phe Gly Met Pro Gly Val His Val Asp
275 280 285

Gly Met Asp Val Leu Lys Val Arg Glu Val Ala Lys Glu Ala Val Thr
290 295 300

Arg Ala Arg Arg Gly Glu Gly Pro Thr Leu Val Glu Cys Glu Thr Tyr
305 310 315 320

Arg Phe Arg Gly His Ser Leu Ala Asp Pro Asp Glu Leu Arg Asp Ala
325 330 335

Ala Glu Lys Ala Lys Tyr Ala Ala Arg Asp Pro Ile Ala Ala Leu Lys
340 345 350

Lys Tyr Leu Ile Glu Asn Lys Leu Ala Lys Glu Ala Glu Leu Lys Ser
355 360 365

Ile Glu Lys Lys Ile Asp Glu Leu Val Glu Glu Ala Val Glu Phe Ala
370 375 380

Asp Ala Ser Pro Gln Pro Gly Arg Ser Gln Leu Leu Glu Asn Val Phe
385 390 395 400

Ala Asp Pro Lys Gly Phe Gly Ile Gly Pro Asp Gly Arg Tyr Arg Cys
405 410 415

Glu Asp Pro Lys Phe Thr Glu Gly Thr Ala Gln Val
420 425



<210> 34 

<211> 344 

<212> PRT 

<213> P. purpurea 

<400> 34 

Met Ser Tyr Pro Lys Lys Val Glu Leu Pro Leu Thr Asn Cys Asn Gln
1 5 10 15

Ile Asn Leu Thr Lys His Lys Leu Leu Val Leu Tyr Glu Asp Met Leu
20 25 30

Leu Gly Arg Asn Phe Glu Asp Met Cys Ala Gln Met Tyr Tyr Lys Gly
35 40 45

Lys Met Phe Gly Phe Val His Leu Tyr Asn Gly Glu Glu Ala Val Ser
50 55 60

Thr Gly Val Ile Lys Leu Leu Asp Ser Lys Asp Tyr Val Cys Ser Thr
65 70 75 80

Tyr Arg Asp His Val His Ala Leu Ser Lys Gly Val Pro Ser Gln Asn
85 90 95

Val Met Ala Glu Leu Phe Gly Lys Glu Thr Gly Cys Ser Arg Gly Arg
100 105 110

Gly Gly Ser Met His Ile Phe Ser Ala Pro His Asn Phe Leu Gly Gly
115 120 125

Phe Ala Phe Ile Ala Glu Gly Ile Pro Val Ala Thr Gly Ala Ala Phe
130 135 140

Gln Ser Ile Tyr Arg Gln Gln Val Leu Lys Glu Pro Gly Glu Leu Arg
145 150 155 160

Val Thr Ala Cys Phe Phe Gly Asp Gly Thr Thr Asn Asn Gly Gln Phe
165 170 175

Phe Glu Cys Leu Asn Met Ala Val Leu Trp Lys Leu Pro Ile Ile Phe
180 185 190

Val Val Glu Asn Asn Gln Trp Ala Ile Gly Met Ala His His Arg Ser
195 200 205

Ser Ser Ile Pro Glu Ile His Lys Lys Ala Glu Ala Phe Gly Leu Pro
210 215 220

Gly Ile Glu Val Asp Gly Met Asp Val Leu Ala Val Arg Gln Val Ala
225 230 235 240

Glu Lys Ala Val Glu Arg Ala Arg Gln Gly Gln Gly Pro Thr Leu Ile
245 250 255

Glu Ala Leu Thr Tyr Arg Phe Arg Gly His Ser Leu Ala Asp Pro Asp
260 265 270

Glu Leu Arg Ser Arg Gln Glu Lys Glu Ala Trp Val Ala Arg Asp Pro
275 280 285

Ile Lys Lys Leu Lys Lys His Ile Leu Asp Asn Gln Ile Ala Ser Ser
290 295 300

Asp Glu Leu Asn Asp Ile Gln Ser Ser Val Lys Ile Asp Leu Glu Gln
305 310 315 320

Ser Val Glu Phe Ala Met Ser Ser Pro Glu Pro Asn Ile Ser Glu Leu
325 330 335

Lys Arg Tyr Leu Phe Ala Asp Asn
340



<210> 35 
<211> 389 
<212> PRT 

<213> Arabidopsis thaliana 
<400> 35 

Met Ala Leu Ser Arg Leu Ser Ser Arg Ser Asn Ile Ile Thr Arg Pro
1 5 10 15

Phe Ser Ala Ala Phe Ser Arg Leu Ile Ser Thr Asp Thr Thr Pro Ile
20 25 30

Thr Ile Glu Thr Ser Leu Pro Phe Thr Ala His Leu Cys Asp Pro Pro
35 40 45

Ser Arg Ser Val Glu Ser Ser Ser Gln Glu Leu Leu Asp Phe Phe Arg
50 55 60

Thr Met Ala Leu Met Arg Arg Met Glu Ile Ala Ala Asp Ser Leu Tyr
65 70 75 80

Lys Ala Asn Val Ile Arg Gly Phe Cys His Leu Tyr Asp Gly Gln Glu
85 90 95

Ala Val Ala Ile Gly Met Glu Ala Ala Ile Thr Lys Lys Asp Ala Ile
100 105 110

Ile Thr Ala Tyr Arg Asp His Cys Ile Phe Leu Gly Arg Gly Gly Ser
115 120 125

Leu His Glu Val Phe Ser Glu Leu Met Gly Arg Gln Ala Gly Cys Ser
130 135 140

Lys Gly Lys Gly Gly Ser Met His Phe Tyr Lys Lys Glu Ser Ser Phe
145 150 155 160

Tyr Gly Gly His Gly Ile Val Gly Ala Gln Val Pro Leu Gly Cys Gly
165 170 175

Ile Ala Phe Ala Gln Lys Tyr Asn Lys Glu Glu Ala Val Thr Phe Ala
180 185 190

Leu Tyr Gly Asp Gly Ala Ala Asn Gln Gly Gln Leu Phe Glu Ala Leu
195 200 205

Asn Ile Ser Ala Leu Trp Asp Leu Pro Ala Ile Leu Val Cys Glu Asn
210 215 220

Asn His Tyr Gly Met Gly Thr Ala Glu Trp Arg Ala Ala Lys Ser Pro
225 230 235 240

Ser Tyr Tyr Lys Arg Gly Asp Tyr Val Pro Gly Leu Lys Val Asp Gly
245 250 255

Met Asp Ala Phe Ala Val Lys Gln Ala Cys Lys Phe Ala Lys Gln His
260 265 270

Ala Leu Glu Lys Gly Pro Ile Ile Leu Glu Met Asp Thr Tyr Arg Tyr
275 280 285

His Gly His Ser Met Ser Asp Pro Gly Ser Thr Tyr Arg Thr Arg Asp
290 295 300

Glu Ile Ser Gly Val Arg Gln Glu Arg Asp Pro Ile Glu Arg Ile Lys
305 310 315 320

Lys Leu Val Leu Ser His Asp Leu Ala Thr Glu Lys Glu Leu Lys Asp
325 330 335

Met Glu Lys Glu Ile Arg Lys Glu Val Asp Asp Ala Ile Ala Lys Ala
340 345 350

Lys Asp Cys Pro Met Pro Glu Pro Ser Glu Leu Phe Thr Asn Val Tyr
355 360 365

Val Lys Gly Phe Gly Thr Glu Ser Phe Gly Pro Asp Arg Lys Glu Val
370 375 380

Lys Ala Ser Leu Pro
385



<210> 36
<211> 390 
<212> PRT 

<213> H. sapiens II 
<400> 36 

Met Arg Lys Met Leu Ala Ala Val Ser Arg Val Leu Ser Gly Ala Ser
1 5 10 15

Gln Lys Pro Ala Ser Arg Val Leu Val Ala Ser Arg Asn Phe Ala Asn
20 25 30

Asp Ala Thr Phe Glu Ile Lys Lys Cys Asp Leu His Arg Leu Glu Glu
35 40 45

Gly Pro Pro Val Thr Thr Val Leu Thr Arg Glu Asp Gly Leu Lys Tyr
50 55 60

Tyr Arg Met Met Gln Thr Val Arg Arg Met Glu Leu Lys Ala Asp Gln
65 70 75 80

Leu Tyr Lys Gln Lys Ile Ile Arg Gly Phe Cys His Leu Cys Asp Gly
85 90 95

Gln Glu Ala Cys Cys Val Gly Leu Glu Ala Gly Ile Asn Pro Thr Asp
100 105 110

His Leu Ile Thr Ala Tyr Arg Ala His Gly Phe Thr Phe Thr Arg Gly
115 120 125

Leu Ser Val Arg Glu Ile Leu Ala Glu Leu Thr Gly Arg Lys Gly Gly
130 135 140

Cys Ala Lys Gly Lys Gly Gly Ser Met His Met Tyr Ala Lys Asn Phe
145 150 155 160

Tyr Gly Gly Asn Gly Ile Val Gly Ala Gln Val Pro Leu Gly Ala Gly
165 170 175

Ile Ala Leu Ala Cys Lys Tyr Asn Gly Lys Asp Glu Val Cys Leu Thr
180 185 190

Leu Tyr Gly Asp Gly Ala Ala Asn Gln Gly Gln Ile Phe Glu Ala Tyr
195 200 205

Asn Met Ala Ala Leu Trp Lys Leu Pro Cys Ile Phe Ile Cys Glu Asn
210 215 220

Asn Arg Tyr Gly Met Gly Thr Ser Val Glu Arg Ala Ala Ala Ser Thr
225 230 235 240

Asp Tyr Tyr Lys Arg Gly Asp Phe Ile Pro Gly Leu Arg Val Asp Gly
245 250 255

Met Asp Ile Leu Cys Val Arg Glu Ala Thr Arg Phe Ala Ala Ala Tyr
260 265 270

Cys Arg Ser Gly Lys Gly Pro Ile Leu Met Glu Leu Gln Thr Tyr Arg
275 280 285

Tyr His Gly His Ser Met Ser Asp Pro Gly Val Ser Tyr Arg Thr Arg
290 295 300

Glu Glu Ile Gln Glu Val Arg Ser Lys Ser Asp Pro Ile Met Leu Leu
305 310 315 320

Lys Asp Arg Met Val Asn Ser Asn Leu Ala Ser Val Glu Glu Leu Lys
325 330 335

Glu Ile Asp Val Glu Val Arg Lys Glu Ile Glu Asp Ala Ala Gln Phe
340 345 350

Ala Thr Ala Asp Pro Glu Pro Pro Leu Glu Glu Leu Gly Tyr His Ile
355 360 365

Tyr Ser Ser Asp Pro Pro Phe Glu Val Arg Gly Ala Asn Gln Trp Ile
370 375 380

Lys Phe Lys Ser Val Ser
385 390



<210> 37 
<211> 420 
<212> PRT 

<213> S. cerevisiae 
<400> 37 

Met Leu Ala Ala Ser Phe Lys Arg Gln Pro Ser Gln Leu Val Arg Gly
1 5 10 15

Leu Gly Ala Val Leu Arg Thr Pro Thr Arg Ile Gly His Val Arg Thr
20 25 30

Met Ala Thr Leu Lys Thr Thr Asp Lys Lys Ala Pro Glu Asp Ile Glu
35 40 45

Gly Ser Asp Thr Val Gln Ile Glu Leu Pro Glu Ser Ser Phe Glu Ser
50 55 60

Tyr Met Leu Glu Pro Pro Asp Leu Ser Tyr Glu Thr Ser Lys Ala Thr
65 70 75 80

Leu Leu Gln Met Tyr Lys Asp Met Val Ile Ile Arg Arg Met Glu Met
85 90 95

Ala Cys Asp Ala Leu Tyr Lys Ala Lys Lys Ile Arg Gly Phe Cys His
100 105 110

Leu Ser Val Gly Gln Glu Ala Ile Ala Val Gly Ile Glu Asn Ala Ile
115 120 125

Thr Lys Leu Asp Ser Ile Ile Thr Ser Tyr Arg Cys His Gly Phe Thr
130 135 140

Phe Met Arg Gly Ala Ser Val Lys Ala Val Leu Ala Glu Leu Met Gly
145 150 155 160

Arg Arg Ala Gly Val Ser Tyr Gly Lys Gly Gly Ser Met His Leu Tyr
165 170 175

Ala Pro Gly Phe Tyr Gly Gly Asn Gly Ile Val Gly Ala Gln Val Pro
180 185 190

Leu Gly Ala Gly Leu Ala Phe Ala His Gln Tyr Lys Asn Glu Asp Ala
195 200 205

Cys Ser Phe Thr Leu Tyr Gly Asp Gly Ala Ser Asn Gln Gly Gln Val
210 215 220

Phe Glu Ser Phe Asn Met Ala Lys Leu Trp Asn Leu Pro Val Val Phe
225 230 235 240

Cys Cys Glu Asn Asn Lys Tyr Gly Met Gly Thr Ala Ala Ser Arg Ser
245 250 255

Ser Ala Met Thr Glu Tyr Phe Lys Arg Gly Gln Tyr Ile Pro Gly Leu
260 265 270

Lys Val Asn Gly Met Asp Ile Leu Ala Val Tyr Gln Ala Ser Lys Phe
275 280 285

Ala Lys Asp Trp Cys Leu Ser Gly Lys Gly Pro Leu Val Leu Glu Tyr
290 295 300

Glu Thr Tyr Arg Tyr Gly Gly His Ser Met Ser Asp Pro Gly Thr Thr
305 310 315 320

Tyr Arg Thr Arg Asp Glu Ile Gln His Met Arg Ser Lys Asn Asp Pro
325 330 335

Ile Ala Gly Leu Lys Met His Leu Ile Asp Leu Gly Ile Ala Thr Glu
340 345 350

Ala Glu Val Lys Ala Tyr Asp Lys Ser Ala Arg Lys Tyr Val Asp Glu
355 360 365

Gln Val Glu Leu Ala Asp Ala Ala Pro Pro Pro Glu Ala Lys Leu Ser
370 375 380

Ile Leu Phe Glu Asp Val Tyr Val Lys Gly Thr Glu Thr Pro Thr Leu
385 390 395 400

Arg Gly Arg Ile Pro Glu Asp Thr Trp Asp Phe Lys Lys Gln Gly Phe
405 410 415

Ala Ser Arg Asp
420



<210> 38 
<211> 396 
<212> PRT 
<213> A. suum I 

<400> 38 

Met Ile Phe Val Phe Ala Asn Ile Phe Lys Val Pro Thr Val Ser Pro
1 5 10 15

Ser Val Met Ala Ile Ser Val Arg Leu Ala Ser Thr Glu Ala Thr Phe
20 25 30

Gln Thr Lys Pro Phe Lys Leu His Lys Leu Asp Ser Gly Pro Asp Ile
35 40 45

Asn Val His Val Thr Lys Glu Asp Ala Val His Tyr Tyr Thr Gln Met
50 55 60

Leu Thr Ile Arg Arg Met Glu Ser Ala Ala Gly Asn Leu Tyr Lys Glu
65 70 75 80

Lys Lys Val Arg Gly Phe Cys His Leu Tyr Ser Gly Gln Glu Ala Cys
85 90 95

Ala Val Gly Thr Lys Ala Ala Met Asp Ala Gly Asp Ala Ala Val Thr
100 105 110

Ala Tyr Arg Cys His Gly Trp Thr Tyr Leu Ser Gly Ser Ser Val Ala
115 120 125

Lys Val Leu Cys Glu Leu Thr Gly Arg Ile Thr Gly Asn Val Tyr Gly
130 135 140

Lys Gly Gly Ser Met His Met Tyr Gly Glu Asn Phe Tyr Gly Gly Asn
145 150 155 160

Gly Ile Val Gly Ala Gln Gln Pro Leu Gly Thr Gly Ile Ala Phe Ala
165 170 175

Met Lys Tyr Arg Lys Glu Lys Asn Val Cys Ile Thr Met Phe Gly Asp
180 185 190

Gly Ala Thr Asn Gln Gly Gln Leu Phe Glu Ser Met Asn Met Ala Lys
195 200 205

Leu Trp Asp Leu Pro Val Leu Tyr Val Cys Glu Asn Asn Gly Tyr Gly
210 215 220

Met Gly Thr Ala Ala Ala Arg Ser Ser Ala Ser Thr Asp Tyr Tyr Thr
225 230 235 240

Arg Gly Asp Tyr Val Pro Gly Ile Trp Val Asp Gly Met Asp Val Leu
245 250 255

Ala Val Arg Gln Ala Val Arg Trp Ala Lys Glu Trp Cys Asn Ala Gly
260 265 270

Lys Gly Pro Leu Met Ile Glu Met Ala Thr Tyr Arg Tyr Ser Gly His
275 280 285

Ser Met Ser Asp Pro Gly Thr Ser Tyr Arg Thr Arg Glu Glu Val Gln
290 295 300

Glu Val Arg Lys Thr Arg Asp Pro Ile Thr Gly Phe Lys Asp Lys Ile
305 310 315 320

Val Thr Ala Gly Leu Val Thr Glu Asp Glu Ile Lys Glu Ile Asp Lys
325 330 335

Gln Val Arg Lys Glu Ile Asp Ala Ala Val Lys Gln Ala His Thr Asp
340 345 350

Lys Glu Ser Pro Val Glu Leu Met Leu Thr Asp Ile Tyr Tyr Asn Thr
355 360 365

Pro Ala Gln Tyr Val Arg Cys Thr Thr Asp Glu Val Leu Gln Lys Tyr
370 375 380

Leu Thr Ser Glu Glu Ala Val Lys Ala Leu Ala Lys
385 390 395



<210> 39 
<211> 370 
<212> PRT 

<213> M. capricolum 
<400> 39 

Met Thr Tyr Leu Gly Lys Phe Asp Pro Leu Lys Asn Glu Lys Val Cys
1 5 10 15

Val Leu Asp Lys Asp Gly Lys Val Ile Asn Pro Lys Leu Met Pro Lys
20 25 30

Ile Ser Asp Gln Glu Ile Leu Glu Ala Tyr Lys Ile Met Asn Leu Ser
35 40 45

Arg Arg Gln Asp Ile Tyr Gln Asn Thr Met Gln Arg Gln Gly Arg Leu
50 55 60

Leu Ser Phe Leu Ser Ser Thr Gly Gln Glu Ala Cys Glu Val Ala Tyr
65 70 75 80

Ile Asn Ala Leu Asn Lys Lys Thr Asp His Phe Val Ser Gly Tyr Arg
85 90 95

Asn Asn Ala Ala Trp Leu Ala Met Gly Gln Leu Val Arg Asn Ile Met
100 105 110

Leu Tyr Trp Ile Gly Asn Glu Ala Gly Gly Lys Ala Pro Glu Gly Val
115 120 125

Asn Cys Leu Pro Pro Asn Ile Val Ile Gly Ser Gln Tyr Ser Gln Ala
130 135 140

Thr Gly Ile Ala Phe Ala Asp Lys Tyr Arg Lys Thr Gly Gly Val Val
145 150 155 160

Val Thr Thr Thr Gly Asp Gly Gly Ser Ser Glu Gly Glu Thr Tyr Glu
165 170 175

Ala Met Asn Phe Ala Lys Leu His Glu Val Pro Cys Ile Phe Val Ile
180 185 190

Glu Asn Asn Lys Trp Ala Ile Ser Thr Ala Arg Ser Glu Gln Thr Lys
195 200 205

Ser Ile Asn Phe Ala Val Lys Gly Ile Ala Thr Gly Ile Pro Ser Ile
210 215 220

Ile Val Asp Gly Asn Asp Tyr Leu Ala Cys Ile Gly Val Phe Lys Glu
225 230 235 240

Val Val Glu Tyr Val Arg Lys Gly Asn Gly Pro Val Leu Val Glu Cys
245 250 255

Asp Thr Tyr Arg Leu Gly Ala His Ser Ser Ser Asp Asn Pro Asp Ala
260 265 270

Tyr Arg Pro Lys Gly Glu Phe Glu Glu Met Ala Lys Phe Asp Pro Leu
275 280 285

Ile Arg Leu Lys Gln Tyr Leu Ile Asp Lys Lys Ile Trp Ser Asp Glu
290 295 300

Gln Gln Ala Gln Leu Glu Ala Glu Gln Asp Lys Phe Val Ala Asp Glu
305 310 315 320

Phe Ala Trp Val Glu Lys Asn Lys Asn Tyr Asp Leu Ile Asp Ile Phe
325 330 335

Lys Tyr Gln Tyr Asp Lys Met Asp Ile Phe Leu Glu Glu Gln Tyr Lys
340 345 350

Glu Ala Lys Glu Phe Phe Glu Lys Tyr Pro Glu Ser Lys Glu Gly Gly
355 360 365

His His
370



<210> 40 

<211> 369 

<212> PRT 

<213> B. subtilis
<400> 40

Met Gly Val Lys Thr Phe Gln Phe Pro Phe Ala Glu Gln Leu Glu Lys
1 5 10 15

Val Ala Glu Gln Phe Pro Thr Phe Gln Ile Leu Asn Glu Glu Gly Glu
20 25 30

Val Val Asn Glu Glu Ala Met Pro Glu Leu Ser Asp Glu Gln Leu Lys
35 40 45

Glu Leu Met Arg Arg Met Val Tyr Thr Arg Ile Leu Asp Gln Arg Ser
50 55 60

Ile Ser Leu Asn Arg Gln Gly Arg Leu Gly Phe Tyr Ala Pro Thr Ala
65 70 75 80

Gly Gln Glu Ala Ser Gln Ile Ala Ser His Phe Ala Leu Glu Lys Glu
85 90 95

Asp Phe Ile Leu Pro Gly Tyr Arg Asp Val Pro Gln Ile Ile Trp His
100 105 110

Gly Leu Pro Leu Tyr Gln Ala Phe Leu Phe Ser Arg Gly His Phe His
115 120 125

Gly Asn Gln Ile Pro Glu Gly Val Asn Val Leu Pro Pro Gln Ile Ile
130 135 140

Ile Gly Ala Gln Tyr Ile Gln Ala Ala Gly Val Ala Leu Gly Leu Lys
145 150 155 160

Met Arg Gly Lys Lys Ala Val Ala Ile Thr Tyr Thr Gly Asp Gly Gly
165 170 175

Thr Ser Gln Gly Asp Phe Tyr Glu Gly Ile Asn Phe Ala Gly Ala Phe
180 185 190

Lys Ala Pro Ala Ile Phe Val Val Gln Asn Asn Arg Phe Ala Ile Ser
195 200 205

Thr Pro Val Glu Lys Gln Thr Val Ala Lys Thr Leu Ala Gln Lys Ala
210 215 220

Val Ala Ala Gly Ile Pro Gly Ile Gln Val Asp Gly Met Asp Pro Leu
225 230 235 240

Ala Val Tyr Ala Ala Val Lys Ala Ala Arg Glu Arg Ala Ile Asn Gly
245 250 255

Glu Gly Pro Thr Leu Ile Glu Thr Leu Cys Phe Arg Tyr Gly Pro His
260 265 270

Thr Met Ser Gly Asp Asp Pro Thr Arg Tyr Arg Ser Lys Glu Leu Glu
275 280 285

Asn Glu Trp Ala Lys Lys Asp Pro Leu Val Arg Phe Arg Lys Phe Leu
290 295 300

Glu Ala Lys Gly Leu Trp Ser Glu Glu Glu Glu Asn Asn Val Ile Glu
305 310 315 320

Gln Ala Lys Glu Glu Ile Lys Glu Ala Ile Lys Lys Ala Asp Glu Thr
325 330 335

Pro Lys Gln Lys Val Thr Asp Leu Ile Ser Ile Met Phe Glu Glu Leu
340 345 350

Pro Phe Asn Leu Lys Glu Gln Tyr Glu Ile Tyr Lys Glu Lys Glu Ser
355 360 365

Lys



<210> 41 
<211> 129 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Consensus 
<400> 41 

Leu Tyr Met Arg Arg Glu Leu Tyr Gly Phe His Leu Gly Gln Glu Ala
1 5 10 15

Gly Lys Asp Tyr Arg His Gly Ser Val Glu Leu Gly Gly Gly Gly Gly
20 25 30

Ser Met His Phe Gly Gly Ile Gly Ala Gln Pro Gly Ala Phe Ala Lys
35 40 45

Tyr Arg Val Thr Gly Asp Gly Asn Gln Gly Gln Phe Glu Asn Met Ala
50 55 60

Leu Trp Leu Pro Ile Phe Val Glu Asn Asn Gly Thr Ala Arg Lys Gly
65 70 75 80

Pro Gly Val Asp Gly Met Asp Leu Ala Val Ala Lys Ala Gly Gly Pro
85 90 95

Leu Glu Thr Tyr Arg Tyr Gly His Ser Met Ser Asp Pro Tyr Arg Arg
100 105 110

Glu Asp Pro Ile Leu Lys Leu Ala Glu Glu Lys Lys Ala Ala Pro Pro
115 120 125

Leu



<210> 42 

<211> 406 

<212> PRT 

<213> Arabidopsis thaliana
<400> 42

Met Ser Ser Ile Ile His Gly Ala Gly Ala Ala Thr Thr Thr Leu Ser
1 5 10 15

Thr Phe Asn Ser Val Asp Ser Lys Lys Leu Phe Val Ala Pro Ser Arg
20 25 30

Thr Asn Leu Ser Val Arg Ser Gln Arg Tyr Ile Val Ala Gly Ser Asp
35 40 45

Ala Ser Lys Lys Ser Phe Gly Ser Gly Leu Arg Val Arg His Ser Gln
50 55 60

Lys Leu Ile Pro Asn Ala Val Ala Thr Lys Glu Ala Asp Thr Ser Ala
65 70 75 80

Ser Thr Gly His Glu Leu Leu Leu Phe Glu Ala Leu Gln Glu Gly Leu
85 90 95

Glu Glu Glu Met Asp Arg Asp Pro His Val Cys Val Met Gly Glu Asp
100 105 110

Val Gly His Tyr Gly Gly Ser Tyr Lys Val Thr Lys Gly Leu Ala Asp
115 120 125

Lys Phe Gly Asp Leu Arg Val Leu Asp Thr Pro Ile Cys Glu Asn Ala
130 135 140

Phe Thr Gly Met Gly Ile Gly Ala Ala Met Thr Gly Leu Arg Pro Val
145 150 155 160

Ile Glu Gly Met Asn Met Gly Phe Leu Leu Leu Ala Phe Asn Gln Ile
165 170 175

Ser Asn Asn Cys Gly Met Leu His Tyr Thr Ser Gly Gly Gln Phe Thr
180 185 190

Ile Pro Val Val Ile Arg Gly Pro Gly Gly Val Gly Arg Gln Leu Gly
195 200 205

Ala Glu His Ser Gln Arg Leu Glu Ser Tyr Phe Gln Ser Ile Pro Gly
210 215 220

Ile Gln Met Val Ala Cys Ser Thr Pro Tyr Asn Ala Lys Gly Leu Met
225 230 235 240

Lys Ala Ala Ile Arg Ser Glu Asn Pro Val Ile Leu Phe Glu His Val
245 250 255

Leu Leu Tyr Asn Leu Lys Glu Lys Ile Pro Asp Glu Asp Tyr Ile Cys
260 265 270

Asn Leu Glu Glu Ala Glu Met Val Arg Pro Gly Glu His Ile Thr Ile
275 280 285

Leu Thr Tyr Ser Arg Met Arg Tyr His Val Met Gln Ala Ala Lys Thr
290 295 300

Leu Val Asn Lys Gly Tyr Asp Pro Glu Val Ile Asp Ile Arg Ser Leu
305 310 315 320

Lys Pro Phe Asp Leu His Thr Ile Gly Asn Ser Val Lys Lys Thr His
325 330 335

Arg Val Leu Ile Val Glu Glu Cys Met Arg Thr Gly Gly Ile Gly Ala
340 345 350

Ser Leu Thr Ala Ala Ile Asn Glu Asn Phe His Asp Tyr Leu Asp Ala
355 360 365

Pro Val Met Cys Leu Ser Ser Gln Asp Val Pro Thr Pro Tyr Ala Gly
370 375 380

Thr Leu Glu Glu Trp Thr Val Val Gln Pro Ala Gln Ile Val Thr Ala
385 390 395 400

Val Glu Gln Leu Cys Gln
405



<210> 43 

<211> 331 

<212> PRT 

<213> P. purpurea 

<400> 43 

Met Ser Lys Val Phe Met Phe Asp Ala Leu Arg Ala Ala Thr Asp Glu
1 5 10 15

Glu Met Glu Lys Asp Leu Thr Val Cys Val Ile Gly Glu Asp Val Gly
20 25 30

His Tyr Gly Gly Ser Tyr Lys Val Thr Lys Asp Leu His Ser Lys Tyr
35 40 45

Gly Asp Leu Arg Val Leu Asp Thr Pro Ile Ala Glu Asn Ser Phe Thr
50 55 60

Gly Met Ala Ile Gly Ala Ala Ile Thr Gly Leu Arg Pro Ile Val Glu
65 70 75 80

Gly Met Asn Met Ser Phe Leu Leu Leu Ala Phe Asn Gln Ile Ser Asn
85 90 95

Asn Ala Gly Met Leu Arg Tyr Thr Ser Gly Gly Asn Phe Thr Leu Pro
100 105 110

Leu Val Ile Arg Gly Pro Gly Gly Val Gly Arg Gln Leu Gly Ala Glu
115 120 125

His Ser Gln Arg Leu Glu Ala Tyr Phe Gln Ala Ile Pro Gly Leu Lys
130 135 140

Ile Val Ala Cys Ser Thr Pro Tyr Asn Ala Lys Gly Leu Leu Lys Ser
145 150 155 160

Ala Ile Arg Asp Asn Asn Pro Val Val Phe Phe Glu His Val Leu Leu
165 170 175

Tyr Asn Leu Gln Glu Glu Ile Pro Glu Asp Glu Tyr Leu Ile Pro Leu
180 185 190

Asp Lys Ala Glu Val Val Arg Lys Gly Lys Asp Ile Thr Ile Leu Thr
195 200 205

Tyr Ser Arg Met Arg His His Val Thr Glu Ala Leu Pro Leu Leu Leu
210 215 220

Asn Asp Gly Tyr Asp Pro Glu Val Leu Asp Leu Ile Ser Leu Lys Pro
225 230 235 240

Leu Asp Ile Asp Ser Ile Ser Val Ser Val Lys Lys Thr His Arg Val
245 250 255

Leu Ile Val Glu Glu Cys Met Lys Thr Ala Gly Ile Gly Ala Glu Leu
260 265 270

Ile Ala Gln Ile Asn Glu His Leu Phe Asp Glu Leu Asp Ala Pro Val
275 280 285

Val Arg Leu Ser Ser Gln Asp Ile Pro Thr Pro Tyr Asn Gly Ser Leu
290 295 300

Glu Gln Ala Thr Val Ile Gln Pro His Gln Ile Ile Asp Ala Val Lys
305 310 315 320

Asn Ile Val Asn Ser Ser Lys Thr Ile Thr Thr
325 330



<210> 44 
<211> 363 
<212> PRT 

<213> Arabidopsis thaliana 
<400> 44 

Met Leu Gly Ile Leu Arg Gln Arg Ala Ile Asp Gly Ala Ser Thr Leu
1 5 10 15

Arg Arg Thr Arg Phe Ala Leu Val Ser Ala Arg Ser Tyr Ala Ala Gly
20 25 30

Ala Lys Glu Met Thr Val Arg Asp Ala Leu Asn Ser Ala Ile Asp Glu
35 40 45

Glu Met Ser Ala Asp Pro Lys Val Phe Val Met Gly Glu Glu Val Gly
50 55 60

Gln Tyr Gln Gly Ala Tyr Lys Ile Thr Lys Gly Leu Leu Glu Lys Tyr
65 70 75 80

Gly Pro Glu Arg Val Tyr Asp Thr Pro Ile Thr Glu Ala Gly Phe Thr
85 90 95

Gly Ile Gly Val Gly Ala Ala Tyr Ala Gly Leu Lys Pro Val Val Glu
100 105 110

Phe Met Thr Phe Asn Phe Ser Met Gln Ala Ile Asp His Ile Ile Asn
115 120 125

Ser Ala Ala Lys Ser Asn Tyr Met Ser Ala Gly Gln Ile Asn Val Pro
130 135 140

Ile Val Phe Arg Gly Pro Asn Gly Ala Ala Ala Gly Val Gly Ala Gln
145 150 155 160

His Ser Gln Cys Tyr Ala Ala Trp Tyr Ala Ser Val Pro Gly Leu Lys
165 170 175

Val Leu Ala Pro Tyr Ser Ala Glu Asp Ala Arg Gly Leu Leu Lys Ala
180 185 190

Ala Ile Arg Asp Pro Asp Pro Val Val Phe Leu Glu Asn Glu Leu Leu
195 200 205

Tyr Gly Glu Ser Phe Pro Ile Ser Glu Glu Ala Leu Asp Ser Ser Phe
210 215 220

Cys Leu Pro Ile Gly Lys Ala Lys Ile Glu Arg Glu Gly Lys Asp Val
225 230 235 240

Thr Ile Val Thr Phe Ser Lys Met Val Gly Phe Ala Leu Lys Ala Ala
245 250 255

Glu Lys Leu Ala Glu Glu Gly Ile Ser Ala Glu Val Ile Asn Leu Arg
260 265 270

Ser Ile Arg Pro Leu Asp Arg Ala Thr Ile Asn Ala Ser Val Arg Lys
275 280 285

Thr Ser Arg Leu Val Thr Val Glu Glu Gly Phe Pro Gln His Gly Val
290 295 300

Cys Ala Glu Ile Cys Ala Ser Val Val Glu Glu Ser Phe Ser Tyr Leu
305 310 315 320

Asp Ala Pro Val Glu Arg Ile Ala Gly Ala Asp Val Pro Ile Pro Tyr
325 330 335

Thr Ala Asn Leu Glu Arg Leu Ala Leu Pro Gln Ile Glu Asp Ile Val
340 345 350

Arg Ala Ser Lys Arg Ala Cys Tyr Arg Ser Lys
355 360



<210> 45 
<211> 359 
<212> PRT 
<213> H. sapiens 

<400> 45 

Met Ala Ala Val Ser Gly Leu Val Arg Arg Pro Leu Arg Glu Val Ser
1 5 10 15

Gly Leu Leu Lys Arg Arg Phe His Trp Thr Ala Pro Ala Ala Leu Gln
20 25 30

Val Thr Val Arg Asp Ala Ile Asn Gln Gly Met Asp Glu Glu Leu Glu
35 40 45

Arg Asp Glu Lys Val Phe Leu Leu Gly Glu Glu Val Ala Gln Tyr Asp
50 55 60

Gly Ala Tyr Lys Val Ser Arg Gly Leu Trp Lys Lys Tyr Gly Asp Lys
65 70 75 80

Arg Ile Ile Asp Thr Pro Ile Ser Glu Met Gly Phe Ala Gly Ile Ala
85 90 95

Val Gly Ala Ala Met Ala Gly Leu Arg Pro Ile Cys Glu Phe Met Thr
100 105 110

Phe Asn Phe Ser Met Gln Ala Ile Asp Gln Val Ile Asn Ser Ala Ala
115 120 125

Lys Thr Tyr Tyr Met Ser Gly Gly Leu Gln Pro Val Pro Ile Val Phe
130 135 140

Arg Gly Pro Asn Gly Ala Ser Ala Gly Val Ala Ala Gln His Ser Gln
145 150 155 160

Cys Phe Ala Ala Trp Tyr Gly His Cys Pro Gly Leu Lys Val Val Ser
165 170 175

Pro Trp Asn Ser Glu Asp Ala Lys Gly Leu Ile Lys Ser Ala Ile Arg
180 185 190



Asp Asn Asn Pro Val Val Val Leu Glu Asn Glu Leu Met Tyr Gly Val
195 200 205

Pro Phe Glu Phe Leu Pro Glu Ala Gln Ser Lys Asp Phe Leu Ile Pro
210 215 220

Ile Gly Lys Ala Lys Ile Glu Arg Gln Gly Thr His Ile Thr Val Val
225 230 235 240

Ser His Ser Arg Pro Val Gly His Cys Leu Glu Ala Ala Ala Val Leu
245 250 255

Ser Lys Glu Gly Val Glu Cys Glu Val Ile Asn Met Arg Thr Ile Arg
260 265 270

Pro Met Asp Met Glu Thr Ile Glu Ala Ser Val Met Lys Thr Asn His
275 280 285

Leu Val Thr Val Glu Gly Gly Trp Pro Gln Phe Gly Val Gly Ala Glu
290 295 300

Ile Cys Ala Arg Ile Met Glu Gly Pro Ala Phe Asn Phe Leu Asp Ala
305 310 315 320

Pro Ala Val Arg Val Thr Gly Ala Asp Val Pro Met Pro Tyr Ala Lys
325 330 335

Ile Leu Glu Asp Asn Ser Ile Pro Gln Val Lys Asp Ile Ile Phe Ala
340 345 350

Ile Lys Lys Thr Leu Asn Ile
355



<210> 46 
<211> 366 
<212> PRT 

<213> S. cerevisiae 
<400> 46 

Met Phe Ser Arg Leu Pro Thr Ser Leu Ala Arg Asn Val Ala Arg Arg
1 5 10 15

Ala Pro Thr Ser Phe Val Arg Pro Ser Ala Ala Ala Ala Ala Leu Arg
20 25 30

Phe Ser Ser Thr Lys Thr Met Thr Val Arg Glu Ala Leu Asn Ser Ala
35 40 45

Met Ala Glu Glu Leu Asp Arg Asp Asp Asp Val Phe Leu Ile Gly Glu
50 55 60

Glu Val Ala Gln Tyr Asn Gly Ala Tyr Lys Val Ser Lys Gly Leu Leu
65 70 75 80

Asp Arg Phe Gly Glu Arg Arg Val Val Asp Thr Pro Ile Thr Glu Tyr
85 90 95

Gly Phe Thr Gly Leu Ala Val Gly Ala Ala Leu Lys Gly Leu Lys Pro
100 105 110

Ile Val Glu Phe Met Ser Phe Asn Phe Ser Met Gln Ala Ile Asp His
115 120 125

Val Val Asn Ser Ala Ala Lys Thr His Tyr Met Ser Gly Gly Thr Gln
130 135 140

Lys Cys Gln Met Val Phe Arg Gly Pro Asn Gly Ala Ala Val Gly Leu
145 150 155 160

Gly Ala Gln His Ser Gln Asp Phe Ser Pro Trp Tyr Gly Ser Ile Pro
165 170 175

Gly Leu Lys Val Leu Val Pro Tyr Ser Ala Glu Asp Ala Arg Gly Leu
180 185 190

Leu Lys Ala Ala Ile Arg Asp Pro Asn Pro Val Val Phe Leu Glu Asn
195 200 205

Glu Leu Leu Tyr Gly Glu Ser Phe Glu Ile Ser Glu Glu Ala Leu Ser
210 215 220

Pro Glu Phe Thr Leu Pro Tyr Lys Ala Lys Ile Glu Arg Glu Gly Thr
225 230 235 240

Asp Ile Ser Ile Val Thr Tyr Thr Arg Asn Val Gln Phe Ser Leu Glu
245 250 255

Ala Ala Glu Ile Leu Gln Lys Lys Tyr Gly Val Ser Ala Glu Val Ile
260 265 270

Asn Leu Arg Ser Ile Arg Pro Leu Asp Thr Glu Ala Ile Ile Lys Thr
275 280 285

Val Lys Lys Thr Asn His Leu Ile Thr Val Glu Ser Thr Phe Pro Ser
290 295 300

Phe Gly Val Gly Ala Glu Ile Val Ala Gln Val Met Glu Ser Glu Ala
305 310 315 320

Phe Asp Tyr Leu Asp Ala Pro Ile Gln Arg Val Thr Gly Ala Asp Val
325 330 335

Pro Thr Pro Tyr Ala Lys Glu Leu Glu Asp Phe Ala Phe Pro Asp Thr
340 345 350

Pro Thr Ile Val Lys Ala Val Lys Glu Val Leu Ser Ile Glu
355 360 365



<210> 47 
<211> 361 
<212> PRT 
<213> A. suum 

<400> 47 
Met Ala Val Asn Gly Cys Met Arg Leu Leu Arg Asn Gly Leu Thr Ser
1 5 10 15

Ala Cys Ala Leu Glu Gln Ser Val Arg Arg Leu Ala Ser Gly Thr Leu
20 25 30

Asn Val Thr Val Arg Asp Ala Leu Asn Ala Ala Leu Asp Glu Glu Ile
35 40 45

Lys Arg Asp Asp Arg Val Phe Leu Ile Gly Glu Glu Val Ala Gln Tyr
50 55 60

Asp Gly Ala Tyr Lys Ile Ser Lys Gly Leu Trp Lys Lys Tyr Gly Asp
65 70 75 80

Gly Arg Ile Trp Asp Thr Pro Ile Thr Glu Met Ala Ile Ala Gly Leu
85 90 95

Ser Val Gly Ala Ala Met Asn Gly Leu Arg Pro Ile Cys Glu Phe Met
100 105 110

Ser Met Asn Phe Ser Met Gln Gly Ile Asp His Ile Ile Asn Ser Ala
115 120 125

Ala Lys Ala His Tyr Met Ser Ala Gly Arg Phe His Val Pro Ile Val
130 135 140

Phe Arg Gly Ala Asn Gly Ala Ala Val Gly Val Ala Gln Gln His Ser
145 150 155 160

Gln Asp Phe Thr Ala Trp Phe Met His Cys Pro Gly Val Lys Val Val
165 170 175

Val Pro Tyr Asp Cys Glu Asp Ala Arg Gly Leu Leu Lys Ala Ala Val
180 185 190

Arg Asp Asp Asn Pro Val Ile Cys Leu Glu Asn Glu Ile Leu Tyr Gly
195 200 205

Met Lys Phe Pro Val Ser Pro Glu Ala Gln Ser Pro Asp Phe Val Leu
210 215 220

Pro Phe Gly Gln Ala Lys Ile Gln Arg Pro Gly Lys Asp Ile Thr Ile
225 230 235 240

Val Ser Leu Ser Ile Gly Val Asp Val Ser Leu His Ala Ala Asp Glu
245 250 255

Leu Ala Lys Ser Gly Ile Asp Cys Glu Val Ile Asn Leu Arg Cys Val
260 265 270

Arg Pro Leu Asp Phe Gln Thr Val Lys Asp Ser Val Ile Lys Thr Lys
275 280 285

His Leu Val Thr Val Glu Ser Gly Trp Pro Asn Cys Gly Val Gly Ala
290 295 300

Glu Ile Ser Ala Arg Val Thr Glu Ser Asp Ala Phe Gly Tyr Leu Asp
305 310 315 320

Gly Pro Ile Leu Arg Val Thr Gly Val Asp Val Pro Met Pro Tyr Ala
325 330 335

Gln Pro Leu Glu Thr Ala Ala Leu Pro Gln Pro Ala Asp Val Val Lys
340 345 350

Met Val Lys Lys Cys Leu Asn Val Gln
355 360



<210> 48
<211> 329
<212> PRT
<213> M. capricolum

<400> 48

Met Ala Ile Ile Asn Asn Ile Lys Ala Val Thr Asp Ala Leu Asp Cys
1 5 10 15

Ala Met Gln Arg Asp Pro Asn Val Ile Val Phe Gly Glu Asp Val Gly
20 25 30

Thr Glu Gly Gly Val Phe Arg Ala Thr Gln Gly Leu Ala Val Lys Phe
35 40 45

Gly Asn Asp Arg Cys Phe Asn Ala Pro Ile Ser Glu Ala Met Phe Ala
50 55 60

Gly Val Gly Leu Gly Met Ala Met Asn Gly Met Lys Pro Val Leu Glu
65 70 75 80

Met Gln Phe Glu Gly Leu Gly Leu Ala Ser Leu Gln Asn Ile Phe Thr
85 90 95

Asn Ile Ser Arg Met Arg Asn Arg Thr Arg Gly Lys Tyr Thr Ala Pro
100 105 110

Met Val Ile Arg Met Pro Met Gly Gly Gly Ile Arg Ala Leu Glu His
115 120 125

His Ser Glu Ala Leu Glu Ala Val Tyr Ala His Ile Pro Gly Val Gln
130 135 140

Ile Val Cys Pro Ser Thr Pro Tyr Asp Thr Lys Gly Leu Ile Leu Ala
145 150 155 160

Ala Ile Asp Ser Pro Asp Pro Val Ile Val Val Glu Pro Thr Lys Leu
165 170 175

Tyr Arg Ala Phe Lys Gln Glu Val Pro Asp Glu His Tyr Ile Val Pro
180 185 190

Ile Gly Glu Gly Tyr Lys Ile Gln Glu Gly Asn Asp Leu Thr Val Val
195 200 205

Thr Tyr Gly Ala Gln Thr Val Asp Cys Gln Lys Ala Ile Ala Leu Leu
210 215 220

Lys Glu Thr His Pro Asn Ala Thr Ile Asp Leu Ile Asp Leu Arg Ser
225 230 235 240

Ile Lys Pro Trp Asp Lys Lys Met Val Ile Glu Ser Val Lys Lys Thr
245 250 255

Gly Arg Leu Leu Val Val His Glu Ala Val Lys Ser Phe Ser Val Ser
260 265 270

Ala Glu Ile Ile Ala Thr Val Asn Glu Glu Cys Phe Glu Tyr Ile Lys
275 280 285

Ala Pro Leu Ser Arg Cys Thr Gly Tyr Asp Val Ile Thr Pro Phe Asp
290 295 300

Arg Gly Glu Gly Tyr Phe Gln Val Asn Pro Lys Lys Val Leu Val Lys
305 310 315 320

Met Gln Glu Leu Leu Asp Phe Lys Phe
325



<210> 49 

<211> 325 

<212> PRT 

<213> B. subtilis 

<400> 49 

Met Ala Gln Met Thr Met Val Gln Ala Ile Thr Asp Ala Leu Arg Ile
1 5 10 15

Glu Leu Lys Asn Asp Pro Asn Val Leu Ile Phe Gly Glu Asp Val Gly
20 25 30

Val Asn Gly Gly Val Phe Arg Ala Thr Glu Gly Leu Gln Ala Glu Phe
35 40 45

Gly Glu Asp Arg Val Phe Asp Thr Pro Leu Ala Glu Ser Gly Ile Gly
50 55 60

Gly Leu Ala Ile Gly Leu Ala Leu Gln Gly Phe Arg Pro Val Pro Glu
65 70 75 80

Ile Gln Phe Phe Gly Phe Val Tyr Glu Val Met Asp Ser Ile Cys Gly
85 90 95

Gln Met Ala Arg Ile Arg Tyr Arg Thr Gly Gly Arg Tyr His Met Pro
100 105 110

Ile Thr Ile Arg Ser Pro Phe Gly Gly Gly Val His Thr Pro Glu Leu
115 120 125

His Ser Asp Ser Leu Glu Gly Leu Val Ala Gln Gln Pro Gly Leu Lys
130 135 140

Val Val Ile Pro Ser Thr Pro Tyr Asp Ala Lys Gly Leu Leu Ile Ser
145 150 155 160

Ala Ile Arg Asp Asn Asp Pro Val Ile Phe Leu Glu His Leu Lys Leu
165 170 175

Tyr Arg Ser Phe Arg Gln Glu Val Pro Glu Gly Glu Tyr Thr Ile Pro
180 185 190

Ile Gly Lys Ala Asp Ile Lys Arg Glu Gly Lys Asp Ile Thr Ile Ile
195 200 205

Ala Tyr Gly Ala Met Val His Glu Ser Leu Lys Ala Ala Ala Glu Leu
210 215 220

Glu Lys Glu Gly Ile Ser Ala Glu Val Val Asp Leu Arg Thr Val Gln
225 230 235 240

Pro Leu Asp Ile Glu Thr Ile Ile Gly Ser Val Glu Lys Thr Gly Arg
245 250 255

Ala Ile Val Val Gln Glu Ala Gln Arg Gln Ala Gly Ile Ala Ala Asn
260 265 270

Val Val Ala Glu Ile Asn Glu Arg Ala Ile Leu Ser Leu Glu Ala Pro
275 280 285

Val Leu Arg Val Ala Ala Pro Asp Thr Val Tyr Pro Phe Ala Gln Ala
290 295 300

Glu Ser Val Trp Leu Pro Asn Phe Lys Asp Val Ile Glu Thr Ala Lys
305 310 315 320

Lys Val Met Asn Phe
325



<210> 50
<211> 162
<212> PRT
<213> Artificial Sequence

<220>
<223> Description of Artificial Sequence: consensus

<400> 50

Thr Ala Leu Ala Asp Glu Glu Arg Asp Val Gly Glu Val Gly Tyr Gly
1 5 10 15

Tyr Lys Thr Lys Gly Leu Lys Gly Arg Val Asp Thr Pro Ile Glu Phe
20 25 30

Gly Gly Ala Ala Gly Leu Arg Pro Glu Met Phe Ala Asp Ile Asn Ala
35 40 45

Ala Tyr Ser Gly Gly Pro Val Arg Gly Pro Gly Ala His Ser Gln Ala
50 55 60

Pro Gly Leu Lys Val Val Pro Asp Ala Lys Gly Leu Leu Lys Ala Ala
65 70 75 80

Ile Arg Asp Asn Pro Val Leu Glu Leu Leu Tyr Glu Pro Gly Lys Ala
85 90 95

Ile Arg Gly Asp Ile Thr Ile Val Thr Tyr Ser Val Leu Ala Ala Leu
100 105 110

Gly Glu Val Ile Leu Arg Ser Pro Leu Asp Thr Ile Ser Val Lys Thr
115 120 125

Arg Leu Val Glu Glu Gly Val Gly Ala Glu Ile Ala Glu Phe Tyr Leu
130 135 140

Asp Ala Pro Arg Gly Asp Val Pro Pro Tyr Ala Leu Glu Pro Gln Ile
145 150 155 160

Ala Lys



<210> 51 
<211> 352 
<212> PRT 

<213> Arabidopsis thaliana 
<400> 51 

Met Ala Ala Leu Leu Gly Arg Ser Cys Arg Lys Leu Ser Phe Pro Ser
1 5 10 15

Leu Thr His Gly Ala Arg Arg Val Ser Thr Glu Thr Gly Lys Pro Leu
20 25 30

Asn Leu Tyr Ser Ala Ile Asn Gln Ala Leu His Ile Ala Leu Asp Thr
35 40 45

Asp Pro Arg Ser Tyr Val Phe Gly Glu Asp Val Gly Phe Gly Gly Val
50 55 60

Phe Arg Cys Thr Thr Gly Leu Ala Glu Arg Phe Gly Lys Asn Arg Val
65 70 75 80

Phe Asn Thr Pro Leu Cys Glu Gln Gly Ile Val Gly Phe Gly Ile Gly
85 90 95

Leu Ala Ala Met Gly Asn Arg Ala Ile Val Glu Ile Gln Phe Ala Asp
100 105 110

Tyr Ile Tyr Pro Ala Phe Asp Gln Ile Val Asn Glu Ala Ala Lys Phe
115 120 125

Arg Tyr Arg Ser Gly Asn Gln Phe Asn Cys Gly Gly Leu Thr Ile Arg
130 135 140

Ala Pro Tyr Gly Ala Val Gly His Gly Gly His Tyr His Ser Gln Ser
145 150 155 160

Pro Glu Ala Phe Phe Cys His Val Pro Gly Ile Lys Val Val Ile Pro
165 170 175

Arg Ser Pro Arg Glu Ala Lys Gly Leu Leu Leu Ser Cys Ile Arg Asp
180 185 190

Pro Asn Pro Val Val Phe Phe Glu Pro Lys Trp Leu Tyr Arg Gln Ala
195 200 205

Val Glu Glu Val Pro Glu His Asp Tyr Met Ile Pro Leu Ser Glu Ala
210 215 220

Glu Val Ile Arg Glu Gly Asn Asp Ile Thr Leu Val Gly Trp Gly Ala
225 230 235 240

Gln Leu Thr Val Met Glu Gln Ala Cys Leu Asp Ala Glu Lys Glu Gly
245 250 255

Ile Ser Cys Glu Leu Ile Asp Leu Lys Thr Leu Leu Pro Trp Asp Lys
260 265 270

Glu Thr Val Glu Ala Ser Val Lys Lys Thr Gly Arg Leu Leu Ile Ser
275 280 285

His Glu Ala Pro Val Thr Gly Gly Phe Gly Ala Glu Ile Ser Ala Thr
290 295 300

Ile Leu Glu Arg Cys Phe Leu Lys Leu Glu Ala Pro Val Ser Arg Val
305 310 315 320

Cys Gly Leu Asp Thr Pro Phe Pro Leu Val Phe Glu Pro Phe Tyr Met
325 330 335

Pro Thr Lys Asn Lys Ile Leu Asp Ala Ile Lys Ser Thr Val Asn Tyr
340 345 350


<210> 52 
<211> 392 
<212> PRT 
<213> Human 

<400> 52 
Met Ala Val Val Ala Ala Ala Ala Gly Trp Leu Leu Arg Leu Arg Ala
1 5 10 15

Ala Gly Ala Glu Gly His Trp Arg Arg Leu Pro Gly Ala Gly Leu Ala
20 25 30

Arg Gly Phe Leu His Pro Ala Ala Thr Val Glu Asp Ala Ala Gln Arg
35 40 45

Arg Gln Val Ala His Phe Thr Phe Gln Pro Asp Pro Glu Pro Arg Glu
50 55 60

Tyr Gly Gln Thr Gln Lys Met Asn Leu Phe Gln Ser Val Thr Ser Ala
65 70 75 80

Leu Asp Asn Ser Leu Ala Lys Asp Pro Thr Ala Val Ile Phe Gly Glu
85 90 95

Asp Val Ala Phe Gly Gly Val Phe Arg Cys Thr Val Gly Leu Arg Asp
100 105 110

Lys Tyr Gly Lys Asp Arg Val Phe Asn Thr Pro Leu Cys Glu Gln Gly
115 120 125

Ile Val Gly Phe Gly Ile Gly Ile Ala Val Thr Gly Ala Thr Ala Ile
130 135 140

Ala Glu Ile Gln Phe Ala Asp Tyr Ile Phe Pro Ala Phe Asp Gln Ile
145 150 155 160

Val Asn Glu Ala Ala Lys Tyr Arg Tyr Arg Ser Gly Asp Leu Phe Asn
165 170 175

Cys Gly Ser Leu Thr Ile Arg Ser Pro Trp Gly Cys Val Gly His Gly
180 185 190

Ala Leu Tyr His Ser Gln Ser Pro Glu Ala Phe Phe Ala His Cys Pro
195 200 205

Gly Ile Lys Val Val Ile Pro Arg Ser Pro Phe Gln Ala Lys Gly Leu
210 215 220

Leu Leu Ser Cys Ile Glu Asp Lys Asn Pro Cys Ile Phe Phe Glu Pro
225 230 235 240

Lys Ile Leu Tyr Arg Ala Ala Ala Glu Glu Val Pro Ile Glu Pro Tyr
245 250 255

Asn Ile Pro Leu Ser Gln Ala Glu Val Ile Gln Glu Gly Ser Asp Val
260 265 270

Thr Leu Val Ala Trp Gly Thr Gln Val His Val Ile Arg Glu Val Ala
275 280 285

Ser Met Ala Lys Glu Lys Leu Gly Val Ser Cys Glu Val Ile Asp Leu
290 295 300

Arg Thr Ile Ile Pro Trp Asp Val Asp Thr Ile Cys Lys Ser Val Ile
305 310 315 320

Lys Ser Gly Arg Leu Leu Ile Ser His Glu Ala Pro Leu Thr Gly Gly
325 330 335

Phe Ala Ser Glu Ile Ser Ser Thr Val Gln Glu Glu Cys Phe Leu Asn
340 345 350

Leu Glu Ala Pro Ile Ser Arg Val Cys Gly Tyr Asp Thr Pro Phe Pro
355 360 365

His Ile Phe Glu Pro Phe Tyr Ile Pro Asp Lys Trp Lys Cys Tyr Asp
370 375 380

Ala Leu Arg Lys Met Ile Asn Tyr
385 390



<210> 53
<211> 391 
<212> PRT 
<213> Bovine 

<400> 53 

Met Ala Ala Val Ala Ala Phe Ala Gly Trp Leu Leu Arg Leu Arg Ala
1 5 10 15

Ala Gly Ala Asp Gly Pro Trp Arg Arg Leu Cys Gly Ala Gly Leu Ser
20 25 30

Arg Gly Phe Leu Gln Ser Ala Ser Ala Tyr Gly Ala Ala Gln Arg Arg
35 40 45

Gln Val Ala His Phe Thr Phe Gln Pro Asp Pro Glu Pro Val Glu Tyr
50 55 60

Gly Gln Thr Gln Lys Met Asn Leu Phe Gln Ala Val Thr Ser Ala Leu
65 70 75 80

Asp Asn Ser Leu Ala Lys Asp Pro Thr Ala Val Ile Phe Gly Glu Asp
85 90 95

Val Ala Phe Gly Gly Val Phe Arg Cys Thr Val Gly Leu Arg Asp Lys
100 105 110

Tyr Gly Lys Asp Arg Val Phe Asn Thr Pro Leu Cys Glu Gln Gly Ile
115 120 125

Val Gly Phe Gly Ile Gly Ile Ala Val Thr Gly Ala Thr Ala Ile Ala
130 135 140

Glu Ile Gln Phe Ala Asp Tyr Ile Phe Pro Ala Phe Asp Gln Ile Val
145 150 155 160

Asn Glu Ala Ala Lys Tyr Arg Tyr Arg Ser Gly Asp Leu Phe Asn Cys
165 170 175

Gly Ser Leu Thr Ile Arg Ser Pro Trp Gly Cys Val Gly His Gly Ala
180 185 190

Leu Tyr His Ser Gln Ser Pro Glu Ala Phe Phe Ala His Cys Pro Gly
195 200 205

Ile Lys Val Val Val Pro Arg Ser Pro Phe Gln Ala Lys Gly Leu Leu
210 215 220

Leu Ser Cys Ile Glu Asp Lys Asn Pro Cys Ile Phe Phe Glu Pro Lys
225 230 235 240

Ile Leu Tyr Arg Ala Ala Val Glu Gln Val Pro Val Glu Pro Tyr Asn
245 250 255

Ile Pro Leu Ser Gln Ala Glu Val Ile Gln Glu Gly Ser Asp Val Thr
260 265 270

Leu Val Ala Trp Gly Thr Gln Val His Glu Ile Arg Glu Val Ala Ala
275 280 285

Met Ala Gln Glu Lys Leu Gly Val Ser Cys Glu Val Ile Asp Leu Arg
290 295 300

Thr Ile Leu Pro Trp Asp Val Asp Thr Val Cys Lys Ser Val Ile Lys
305 310 315 320

Thr Gly Arg Leu Leu Val Ser His Glu Ala Pro Leu Thr Gly Gly Phe
325 330 335

Ala Ser Glu Ile Ser Ser Thr Val Gln Glu Gln Cys Phe Leu Asn Leu
340 345 350

Glu Ala Pro Ile Ser Arg Val Cys Gly Tyr Asp Thr Pro Phe Pro His
355 360 365

Ile Phe Glu Pro Phe Tyr Ile Pro Asp Lys Trp Lys Cys Tyr Asp Ala
370 375 380

Leu Arg Lys Met Ile Asn Tyr
385 390



<210> 54 
<211> 375 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: consensus 
<400> 54 

Met Ala Ala Val Ala Ala Ala Gly Trp Leu Leu Arg Leu Arg Ala Ala
1 5 10 15

Gly Ala Gly Trp Arg Arg Leu Gly Ala Gly Leu Arg Gly Phe Leu Ala
20 25 30

Ala Ala Gln Arg Arg Gln Val Ala His Phe Thr Phe Gln Pro Asp Pro
35 40 45

Glu Pro Glu Tyr Gly Gln Thr Gln Lys Met Asn Leu Phe Gln Ala Val
50 55 60

Thr Ser Ala Leu Asp Asn Ser Leu Ala Lys Asp Pro Thr Ala Val Ile
65 70 75 80

Phe Gly Glu Asp Val Ala Phe Gly Gly Val Phe Arg Cys Thr Val Gly
85 90 95

Leu Arg Asp Lys Tyr Gly Lys Asp Arg Val Phe Asn Thr Pro Leu Cys
100 105 110

Glu Gln Gly Ile Val Gly Phe Gly Ile Gly Ile Ala Val Thr Gly Ala
115 120 125

Thr Ala Ile Ala Glu Ile Gln Phe Ala Asp Tyr Ile Phe Pro Ala Phe
130 135 140

Asp Gln Ile Val Asn Glu Ala Ala Lys Tyr Arg Tyr Arg Ser Gly Asp
145 150 155 160

Leu Phe Asn Cys Gly Ser Leu Thr Ile Arg Ser Pro Trp Gly Cys Val
165 170 175

Gly His Gly Ala Leu Tyr His Ser Gln Ser Pro Glu Ala Phe Phe Ala
180 185 190

His Cys Pro Gly Ile Lys Val Val Ile Pro Arg Ser Pro Phe Gln Ala
195 200 205

Lys Gly Leu Leu Leu Ser Cys Ile Glu Asp Lys Asn Pro Cys Ile Phe
210 215 220

Phe Glu Pro Lys Ile Leu Tyr Arg Ala Ala Val Glu Glu Val Pro Glu
225 230 235 240

Pro Tyr Asn Ile Pro Leu Ser Gln Ala Glu Val Ile Gln Glu Gly Ser
245 250 255

Asp Val Thr Leu Val Ala Trp Gly Thr Gln Val His Val Ile Arg Glu
260 265 270

Val Ala Met Ala Glu Lys Leu Gly Val Ser Cys Glu Val Ile Asp Leu
275 280 285

Arg Thr Ile Leu Pro Trp Asp Val Asp Thr Val Cys Lys Ser Val Ile
290 295 300

Lys Thr Gly Arg Leu Leu Ile Ser His Glu Ala Pro Leu Thr Gly Gly
305 310 315 320

Phe Ala Ser Glu Ile Ser Ser Thr Val Gln Glu Cys Phe Leu Asn Leu
325 330 335

Glu Ala Pro Ile Ser Arg Val Cys Gly Tyr Asp Thr Pro Phe Pro His
340 345 350

Ile Phe Glu Pro Phe Tyr Ile Pro Asp Lys Trp Lys Cys Tyr Asp Ala
355 360 365

Leu Arg Lys Met Ile Asn Tyr
370 375



Send Correspondence To: Customer Number: 000321

Direct Telephone Calls To: Charles E. Cohen, (314) 231-5400



I hereby declare that all statements made herein of my own knowledge are 
true and that all statements made on information and belief are believed to 
be true; and further that these statements were made with the knowledge that 
willful false statements and the like so made are punishable by fine or 
imprisonment, or both, under Section 1001 of Title 18 of the United States 
Code and that such willful false statements may jeopardize the validity of 
the application or any patent issued thereon. 



Full name of sole or first inventor Douglas D. Randall

Inventor's signature Date

Residence Columbia, Missouri Citizenship USA

Post Office address 207 Rockingham Dr.

Columbia, Missouri 65203



Full name of second joint inventor Brian P. Mooney

Second inventor's signature Date

Residence Columbia, Missouri Citizenship Ireland

Post Office address 1133 Ashland, Apt. 1116

Columbia, Missouri 65201



Full name of third joint inventor Mark L. Johnston

Third inventor's signature Date

Residence Gales Ferry, Connecticut Citizenship USA

Post Office address 22 Oak Ridge

Gales Ferry, Connecticut 06335



Full name of fourth joint inventor Michael H. Luethy

Fourth inventor's signature [illegible] Date [illegible]

Residence Old Mystic, Connecticut Citizenship USA

Post Office address P.O. Box 298

Old Mystic, Connecticut 06372



Full name of fifth joint inventor Jan A. Miernyk 



Fifth inventor's signature Date 

Residence Peoria, Illinois Citizenship USA 

Post Office address 2008 West Clark Street 

Peoria, Illinois 61604